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Introduction 


The  38th  Annual  Sanibel  Symposium,  orga¬ 
nized  by  the  faculty  and  staff  of  the  Quantum 
Theory  Project  of  the  University  of  Florida,  was 
held  on  February  21-27,  1998.  This  year,  the  Ponce 
de  Leon  Conference  Center  in  St.  Augustine, 
Florida,  was  the  site  of  the  gathering  of  more  than 
300  scientists. 

The  symposium  followed  the  established  format 
with  plenary  and  poster  sessions.  A  compact  7-day 
integrated  program  of  quantum  biology,  quantum 
chemistry,  and  condensed  matter  physics  was  pre¬ 
sented.  The  topics  of  the  sessions  covered  by  these 
proceedings  included  Spectroscopy  of  Base  Pairs, 
Quantum/Classical  Molecular  Mechanics,  Simula¬ 
tions  of  Biological  Systems,  Metals  in  Biology,  and 
Linear  Scaling. 

The  articles  were  subjected  to  the  ordinary  ref¬ 
ereeing  procedures  of  the  International  Journal  of 
Quantum  Chemistry.  The  articles  presented  in  the 
sessions  on  quantum  chemistry,  condensed  matter 
physics,  and  associated  poster  sessions  are  pub¬ 
lished  in  a  separate  issue  of  the  International  Jour¬ 
nal  of  Quantum  Chemistry. 

The  organizers  acknowledge  the  following 
sponsors  for  their  support  of  the  1998  Sanibel 
Symposium: 

■  Army  Research  Office  through  Grant 
#DAAG55-98-l-117.  "The  views,  opinions, 
and/or  findings  contained  in  this  report  are 
those  of  the  authors)  and  should  not  be 
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construed  as  an  official  Department  of  the 
Army  position,  policy,  or  decision,  unless  so 
designated  by  other  documentation." 

■  The  Office  of  Naval  Research  through  Grant 
#N00014-98-l-0215.  "This  work  relates  to 
Department  of  the  Navy  Grant  #N000 14-98- 
1-0215  issued  by  the  Office  of  Naval  Re¬ 
search.  The  United  States  Government  has 
the  royalty-free  license  throughout  the  world 
in  all  copyrightable  material  contained 
herein." 

■  IBM  Corporation 

■  HyperCube,  Inc. 

■  Q-Chem,  Inc 

■  The  University  of  Florida 

Very  special  thanks  go  to  the  staff  of  the  Quan¬ 
tum  Theory  Project  of  the  University  of  Florida  for 
handling  the  numerous  administrative,  clerical, 
and  practical  details.  The  organizers  are  proud  to 
recognize  the  contributions  of  Mrs.  Judy  Parker, 
Ms.  Coralu  Clements,  Ms.  Sandra  Weakland,  Dr. 
Greg  Pearl,  and  Mr.  Cristian  Cardenas.  All  the 
graduate  students  of  the  Quantum  Theory  Project 
who  served  as  "gofers"  are  gratefully  recognized 
for  their  contributions  to  the  1998  Sanibel  Sympo¬ 
sium. 

N.  Y.  Ohm 
J.  R.  Sabin 
M.  C.  Zerner 
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ABSTRACT:  Two  closely  related  N-substituted  valpromide  derivatives:  N-valproyl 
glycinamide  and  N-valproyl  glycine  are  comparatively  analyzed,  the  first  of  which  is 
antiepileptic  active  whereas  the  second  is  not.  The  study  is  based  on  a  conformational 
analysis  using  an  AMI  Hamiltonian  that  not  only  search  for  the  lower  energy  structures 
of  each  derivative  but  also  for  the  energy  involved  in  their  mutual  interconversion.  Open 
structures  have  been  compared  with  cyclic  ones,  the  latter  including  those  stabilized  by 
either  inter  or  intra  molecular  hydrogen  bonds  (dimers  and  monomers,  respectively). 
H-bond  formation  has  been  also  evaluated  by  means  of  ab  initio  G94(6~31  +  G(d,p)) 
calculations  for  a  smaller  system  (N-formylglycine/glycinamide)  modeling  both  vacuum 
and  solvent  conditions.  The  conformational  and  electronic  characteristics  of  the  open  and 
cyclic  monomers,  as  well  as  of  the  dimer  N-valproyl  glycinamide  and  N-valproyl  glycine 
structures  are  discussed.  On  the  basis  of  the  results  of  their  comparative  analysis,  we 
have  redefined  the  pharmacophore  previously  proposed  for  N-substituted  valpromides 
[Tasso,  Bruno-Blanch,  Estiu,  Int.  J.  Quant.  Chem.  65(6),  1107  (1997)],  relaxing  some  of  the 
associated  requirements.  The  corrected  model  requires  one  carbon  atom  or  any  bioisosteric 
substituent  in  an  anticlinal  conformation  relative  to  the  aminic  nitrogen  of  the  amide 
moiety,  in  addition  to  one  hydrogen  atom  that  should  be  antiperiplanar  to  the  carbonyl 
oxygen.  This  model  offers  an  explanation  to  the  different  response  of  N-valproyl 
glycinamide  and  N-valproyl  glycine  against  convulsion,  which  is  based  on  conformational 
restrictions.  ©  1998  John  Wiley  &  Sons,  Inc.  Int  J  Quant  Chem  70:  1127- 1136,  1998 
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Introduction 


7 -amino  butyric  acid  (GABA)  and  glycine  are 
among  the  most  important  inhibitory  neuro¬ 
transmitters,  which  play  an  important  role  in  the 
control  of  neuronal  activity  in  the  mammalian  cen¬ 
tral  nervous  system  (CNS)  and  are  thus  related  to 
convulsion  and  epilepsy  [1-8].  Consequently,  a 
tendency  has  developed  to  incorporate  GABA  and 
glycine  derivatives  into  the  newest  antiepileptic 
agents,  like  gabapentin  [9],  milacemide  [10],  and 
N-benzyloxycarbonylglycine  [11]  among  others. 

The  traditional  therapy,  on  the  other  hand,  in¬ 
cludes  valproic  acid  (vpa)  as  one  of  the  four  major 
antiepileptic  drugs  [12-14],  whose  main  advantage 
is  related  to  its  wide  spectrum  of  antiepileptic 
activity  [12].  One  of  its  main  disadvantages,  terato¬ 
genicity,  has  been  assigned,  on  the  basis  of  struc¬ 
ture-teratogenicity  relationships,  to  the  carboxylic 
moiety,  a  fact  that  has  deviated  the  research  effort 
to  the  study  of  the  derivatives  of  its  primary  amide, 
valpromide  (vpd)  [15].  Vpd  was  found  to  be  more 
potent  than  vpa  and  less  teratogenic  [16,  17].  How¬ 
ever,  the  importance  of  vpd  over  vpa  in  humans 
has  no  clinical  implications,  as  vpd  serves  as  a 
prodrug  of  vpa  in  humans  [18].  Therefore,  research 
in  this  line  is  presently  related  to  the  development 
of  stable  vpd  analogs  that  will  not  undergo  bio¬ 
transformation  to  the  corresponding  acid  [19-22]. 

Following  an  ongoing  research  centered  on  vpa, 
vpd,  and  their  derivatives  [19,  23],  in  this  study  we 
focus  our  interest  on  two  glycine-containing  com¬ 
pounds:  N-valproyl  glycine  (glyvpd)  and  N- 
valproyl  glycinamide  (glydvpd). 

N-valproyl  glycine,  a  minor  metabolite  of  vpa 
in  rats  [24],  has  not  shown  qualitative  antiepileptic 
activity  in  mice  [20].  N-valproyl  glycinamide  is  a 
more  recently  tested  compound,  more  effective 
than  vpa  [20].  Recent  pharmacokinetic  studies  have 
concluded  that,  in  dogs,  none  of  the  investigated 
compounds  serve  as  a  prodrug  or  a  chemical  de¬ 
livery  system  for  vpa  and  glycine.  Among  them, 
N-valproyl  glycinamide  shows  a  better  pharmaco¬ 
kinetic  profile,  a  fact  capable  of  explaining  its 
larger  antiepileptic  activity  [20]. 

A  previous  structure-activity  relationship  (SAR) 
analysis  of  several  N-substituted  derivatives  of 
vpd  [19]  has  allowed  us  to  identify  a  pharma- 
cophoric  pattern  that  has  to  be  complied  in  order 
for  the  compounds  to  be  active.  The  pharma¬ 


cophore,  shown  in  Figure  1,  was  mainly  related  to 
the  anticlinal  orientation  of  the  amide  function 
relative  to  the  hydrocarbon  chains  of  the  valproyl 
moiety.  Its  definition  involved  both  the  nuclear 
coordinates  and  the  local  charges  on  the  atomic 
centers.  Because  valproyl  glycine  and  valproyl 
glycinamide  are  also  N-substituted  valpromides, 
we  have  extended  the  conformational  analysis  to 
these  molecules  in  order  to  discern  whether  their 
stable  conformations  comply  or  not  with  the  defi¬ 
nition  of  the  pharmacophore.  Moreover,  from  their 
comparison,  our  goal  is  to  find  out  whether  their 
different  response  against  convulsion  can  be  ex¬ 
plained  on  a  structural  basis,  a  fact  that  would 
reinforce  the  concepts  derived  from  the  study  of 
their  pharmacokinetic  properties  [20]. 

On  the  basis  of  the  knowledge  that  N-acetyl 
glycine  stabilizes  as  a  dimer  structure  through 
intermodular  hydrogen  bonds  [25],  the  stability 
of  monomers  and  dimers  of  glyvpd  and  glydvpd 
has  been  compared  within  the  conformational 
study.  Cyclic  monomers,  stabilized  through  in¬ 
tramolecular  H  bonds  have  been  also  included  in 
the  comparative  analysis.  However,  the  strength  of 
the  H  bonds,  and  the  consequent  stabilization  of 
the  previously  described  structures,  is  largely  de- 


FIGURE  1.  Pharmacophore  proposed  for  N-substituted 
valpromides.  r5  is  defined  in  Figure  4.  Blue,  nitrogen; 
red,  oxygen;  light  blue,  carbon;  gray,  hydrogen;  yellow, 
alkyl  or  aryl  substituents. 
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termined  by  the  dielectric  constant  of  the  media.  In 
this  framework,  because  no  information  about  the 
valpromide  receptor  is  presently  accessible,  no  in¬ 
ference  can  be  made  about  the  polarity  of  the 
environment  in  the  interaction  site.  In  order  to 
gain  insight  in  the  influence  of  the  media  on  the 
biological  response,  calculations  in  vacuum  and 
for  the  solvent  simulated  by  water  as  a  continuum 
have  been  used  to  approach  low  and  high  polarity 
media,  respectively,  and  have  been  evaluated  in  a 
comparative  manner. 


Outline  of  the  Calculation  Procedure 

A  conformational  analysis  has  been  performed 
in  order  to  discern  whether  the  pharmacophore, 
shown  in  Figure  1,  is  defined  in  the  conformation 
of  minimum  energy  of  glyvpd  (Fig.  2)  and/or 
glydvpd  (Fig.3).  Because  the  size  of  the  molecules 
is  not  compatible  with  good-quality  ab  initio  calcu¬ 
lations,  an  AMI  model  Hamiltonian  [26]  (MOP AC 
7.0  package  [27])  has  been  chosen  for  the  confor¬ 
mational  search  in  vacuum,  which  implies  the 
comparison  of  open  and  cyclic  structures.  The 
choice  of  AMI  among  the  available  semiempirical 
methodologies  has  been  largely  justified  in  Refs. 
[12,  28]. 

For  the  open  monomers,  the  structures  associ¬ 
ated  with  the  initial  guesses  for  a  gradient-driven 
full-geometry  optimization  were  generated  by 


means  of  modifications  of  the  torsional  angles 
t5-t8  (Fig.  4)  and  of  those  defined  in  the  hydrocar¬ 
bon  chain  (t1-t4).  These,  and  the  other  geometry 
parameters  were  completely  relaxed  during  the 
optimizations.  In  this  framework,  the  conforma¬ 
tional  search  has  been  performed  as  follows: 

1.  The  t5  value  was  modified  in  90°  steps  from 
0°  to  270°  for  both  glyvpd  and  glydvpd.  In¬ 
termediate  values  were  not  considered  be¬ 
cause  all  the  optimizations  starting  from  the 
above-mentioned  ones  converged  to  values 
close  to  either  r5  =  0°  or  r5  =  180°. 

2.  For  each  of  the  r5  values,  r6  has  been  varied 
in  90°  steps.  In  a  similar  fashion  to  that  de¬ 
scribed  for  r5  two  minima  were  found,  asso¬ 
ciated,  respectively,  with  the  orientation  of 
the  hydrogen  atom  toward  09  (Fig.  4)  or 
opposite  to  it.  The  first  one  is  the  most  stable 
because  it  minimizes  steric  repulsion. 

3.  As  the  next  step  of  the  optimization,  modi¬ 
fications  of  t7  in  60°  and  t8  in  90°  steps 
have  been  performed  for  each  pair  of  r5,  t6 
values. 

4.  It  is  well  known  that  the  "all  trans"  confor¬ 
mation  is  the  most  stable  for  the  hydrocarbon 
chain.  A  thorough  discussion  of  this  subject 
can  be  found  in  Ref.  [23].  This  conformation 
has  been  confirmed,  however,  for  the  differ¬ 
ent  derivatives,  by  means  of  distortions  of 
the  t1-t4  angles  in  60°  and  90°  from  their 


FIGURE  2.  Most  stable  conformations  of  glyvpd.  (a)  Open  monomer,  (b)  Cyclic  monomer. 
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FIGURE  3.  Most  stable  conformations  of  glydvpd.  (a)  Open  monomer,  (b)  Cyclic  monomer. 


starting  180°,  followed  by  full  optimization  of 
the  resulting  structure. 

The  initial  structures  for  the  cyclic  monomers 
have  been  built  by  means  of  the  definition  of  the 
appropriate  combination  of  the  r6-r8  torsional 
angle  values  that  lead  to  the  stabilization  of  an 


FIGURE  4.  Atom  numbering  and  torsional  angles  in 
the  glydvpd  molecule.  N15  is  replaced  by  015  in  glyvpd. 
ri  =  C1C2C3C4,  t2  =  C2C3C4C5,  r3  =  C3C4C5C6, 
t4  =  C4C5C6C7,  r5  =  OgC8C4H11,  r6  =  O9C8N10C12, 
r7  =  C8N10C12C13,  t8  =  N10C12C13N15 


intramolecular  H  bond  between  the  carboxylic 
oxygen  of  the  valproyl  moiety  and  the  H  atom  of 
the  hydroxy  or  amide  group  of  the  gly  or  glyd 
moieties,  respectively. 

Cyclic  and  open  monomers  have  been  used  to 
build  the  dimers  (Figs.  5  and  6),  which  comprise 
H-bond  formation  between  the  carbonyl  oxy¬ 
gen  and  the  amine  nitrogen  of  the  glycine  moiety 
(Ou — N10).  Syn-  and  antiperiplanar  conformations 
of  the  monomers,  comprising  both  open  and  cyclic 
units,  have  been  used  to  build  the  starting  struc¬ 
tures.  Their  stability  has  been  compared  after  a  full 
geometry  relaxation. 

For  both  the  cyclic  and  open  monomers,  as  well 
as  for  the  dimeric  structures,  AMI  calculations 
have  been  also  used  to  evaluate  the  torsional  bar¬ 
rier  around  the  CC  bond  associated  with  t5.  The 
keyword  PRECISE  has  been  always  used  through¬ 
out  the  calculations. 

The  difficulties  associated  with  the  accurate 
quantum  chemical  description  of  the  interactions 
involved  in  H  bonds  are  well  documented  [29-31]. 
It  is  well  known  that  the  results  of  their  semi- 
empirical  evaluation  have  to  be  considered  with 
caution.  In  order  to  confirm  the  conclusions  from 
them  derived,  ab  initio  G-94(HF/6-31  +  G(d,  p) 
calculations  [32]  have  been  performed  for 
molecules  that,  being  smaller  than  glyvpd  and 
glydvpd,  retain  the  local  characteristics  in  the  moi¬ 
eties  involved  in  the  H  bonds:  N-formyl  glycine 
and  N-formyl  glycinamide.  The  stability  of  cyclic 
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and  open  monomers,  as  well  as  dimer  structures, 
has  been  compared  for  these  molecules,  at  this 
level  of  theory,  for  both  vacuum  and  solvent  simu¬ 
lated  conditions.  The  solvent  to  be  approached, 
physiological  media,  is  mainly  defined  by  water.  It 
has  been  modeled,  thus,  by  water  as  a  continuum 
within  an  Onsager  approach  [33]. 

Electronic  descriptors  have  been  derived  from  a 
Mulliken  population  analysis  [34]  performed  at  the 
AMI  level.  In  spite  of  the  lack  of  precision  of  this 
analysis  for  absolute  calculations,  their  results  are 
widely  accepted  in  this  field  for  the  study  of  the 
trends  in  their  variation  on  well-defined  atomic 


centers  that  follow  structural  modifications  per¬ 
formed  to  a  parent  structure  [35,  36]. 


Results  and  Discussion 

GLYVPD  AND  GLYDVPD  MONOMERS 

In  agreement  with  the  results  of  our  previous 
calculations  for  a  set  of  N-substituted  vpd  [19], 
two  minima  result  from  the  AMI  geometry  opti¬ 
mization  procedure,  which  are  related  to  values  of 
t5  close  to  0°  and  180°,  respectively,  and  define 


FIGURE  6.  Most  stable  conformation  of  dimeric  glydvpd. 
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synperiplanar  and  antiperiplanar  09  to  Hn  confor¬ 
mations  (Fig.  4).  According  to  the  results  shown  in 
Tables  I  and  II,  the  synperiplanar  conformation 
is  preferred  by  both  glyvpd  and  glydvpd.  The  pre¬ 
vious  discussion  is  valid  for  open  and  cyclic 
monomers.  Although  only  the  antiperiplanar 
conformation  has  been  found  to  be  associated 
with  the  antiepileptic  activity  [19],  the  energy  dif¬ 
ference  between  both  orientations  within  a  given 
cyclization  pattern  is  close  to  1  kcal/mol  (Tables  I 
and  II),  showing  that  both  conformers  can  coexist 
in  equilibrium.  Moreover,  the  calculated  energy 
barrier  for  their  mutual  interconversion  (close  to 
2.6  kcal/mol)  demonstrates  that  the  active  confor¬ 
mation  can  be  easily  attained,  at  a  low  energy  cost, 
in  the  receptor  site. 

Whereas  the  cyclic  conformation  is  more  stable 
for  the  amide  at  the  semiempirical  AMI  level,  the 
open  structure  is  preferred  for  the  acid  (Tables  I 
and  II).  Cyclization  does  not  imply,  however,  that 
the  structure  become  rigid,  and  the  torsional  free¬ 
dom  around  the  C4C8  bond  does  not  depend  on 
the  internal  array  of  the  glycine  moiety.  The  re¬ 
sults  of  the  ab  initio  calculations  for  the  isolated 
molecules  (Tables  III  and  IV)  are  in  close  agree¬ 
ment  with  the  semiempirical  ones,  stabilizing  in  a 
larger  extent  the  cyclic  structure  for  the  N-formyl 


glycinamide.  When  the  physiological  media  is 
modeled  by  water  as  a  continuum,  the  energy 
difference  between  open  and  cyclic  structures  re¬ 
mains  almost  unchanged,  showing  that  the  possi¬ 
bility  of  cyclization,  which  may  influence  the  inter¬ 
action  at  the  receptor  site,  is  not  dependent  on  the 
nature  of  the  environmental  solvent. 

It  can  be  concluded,  from  the  comparison  of  the 
energies  associated  with  the  different  stable 
monomeric  conformations  of  glyvpd  and  glydvpd, 
that  the  structural  requirements  imposed  by  the 
pharmacophoric  pattern  previously  defined  [19] 
can  be  easily  attained  by  both  molecules  at  a  very 
low  energy  cost,  because  of  their  rotational  free¬ 
dom  around  r5.  No  difference  between  them,  capa¬ 
ble  of  justifying  their  different  response  against 
convulsion,  can  be  derived  from  the  study  of  the 
isolated  units. 

GLYVPD  AND  GLYDVPD  DIMfiRS 

According  to  the  semiempirical  calculations, 
dimeric  conformations  are  more  stable  than  the 
monomeric  ones  (Tables  I  and  II).  Whereas  the 
coordination  of  open  units  leads  to  more  stable 
structures  than  the  cyclic  ones  for  glyvpd,  imply¬ 
ing  an  energy  gain  close  to  6  kcal/mol,  the  coordi- 


TABLE  I _ _ _ _ 

Stable  Conformers  of  N-valproyl-glycine  derived  from  the  AMI  conformational  analysis.8 


Conformer 

d  0— Hintra 

d  O— Hjnter 

t5 

T6 

T7 

Te 

AE 

Monomers 

1-syn 

>5.0 

5.8 

9.0 

113.6 

22.8 

0.0 

2-anti 

>5.0 

166.7 

13.0 

100.0 

6.7 

0.4 

3-syn 

>5.0 

0.4 

-166.2 

91.7 

56.4 

3.5 

4-anti 

>5.0 

168.6 

-174.6 

98.3 

50.2 

2.9 

5-syn 

2.2 

-2.6 

0.1 

81.3 

-69.5 

3.0 

6-anti 

2.1 

164.0 

-3.0 

80.9 

-70.3 

4.0 

Dimers 

7-syn-syn 

>5.0 

2.2 

1.7 

-2.1 

150.1 

-162.5 

-5.9b 

>5.0 

2.2 

4.0 

-0.3 

145.5 

-148.8 

8-anti-anti 

>5.0 

2.1 

-173.5 

6.8 

115.1 

-164.3 

-6.2b 

>5.0 

2.1 

-177.8 

5.9 

120.5 

-162.1 

9-syn-syn 

2.1 

2.1 

10.1 

-4.7 

80.6 

-113.5 

-3.3b 

2.1 

2.1 

13.1 

-2.1 

80.6 

-112.8 

1 0-anti-anti 

2.1 

2.1 

-175.7 

-2.2 

77.6 

-114.6 

-1.4b 

2.1 

2.1 

179.8 

-2.8 

79.4 

-113.2 

ad  o™Hjntra  =  distance  (A)  between  09  and  H33  of  the  same  molecule. 
d  O — Hinter  =  distance  (A)  between  014  and  H30  of  different  monomers. 
t5  =  dihedral  angle  defined  by  OgCg^H^  atoms. 
t6  =  dihedral  angle  defined  by  O9C8N10C12  atoms. 
r7  =  dihedral  angle  defined  by  C8N10C12C13  atoms. 
r8  =  dihedral  angle  defined  by  N10C12C13O15  atoms. 

AE  =  energy  difference  (kcal)  relative  to  the  most  stable  conformer  [AE  =  E  -  E(1.syn)]. 
b  =  energy  difference  (kcal)  relative  to  twice  the  energy  of  the  most  stable  conformer  [AE  =  E  -  2  x  E(1.syn)]. 
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TABLE  II 


Stable  Conformers  of  N-valproyl-glycinamide  derived  from  the  AMI  conformational  analysis.3 


Conformer 

d  0 — Hintra 

d  0  Hjntra 

r5 

r6 

T? 

r8 

AE 

Monomers 

1-syn 

2.2 

-2.7 

-2.6 

-80.7 

53.3 

0.0 

2-anti 

2.2 

159.3 

0.2 

80.7 

-63.0 

0.9 

3-syn 

>3.7 

3.3 

8.3 

143.6 

58.6 

4.4 

4-anti 

>3.7 

166.3 

168.8 

91.1 

-4.4 

4.8 

5-syn 

>3.7 

2.2 

-170.3 

90.9 

35.7 

5.9 

6-anti 

>3.7 

-169.3 

9.1 

122.4 

55.6 

5.3 

Dimers 

7-syn-syn 

2.2 

2.1 

5.6 

0.0 

81.0 

-107.6 

-10.1b 

2.2 

2.1 

2.0 

4.6 

77.8 

-111.0 

8-anti-anti 

2.2 

2.1 

174.2 

3.1 

79.2 

-104.6 

— 7.8b 

2.2 

2.1 

174.3 

1.8 

79.5 

-106.9 

9-anti-anti 

4.3 

2.2 

-171.4 

5.3 

111.1 

-146.7 

-1.8b 

4.3 

2.1 

-171.5 

6.2 

109.9 

-148.2 

d  O  Hjntra  distance  (A)  between  09  and  H33  of  the  same  molecule. 
d  O  Hinter  =  distance  (A)  between  014  and  H30  of  different  molecules. 
t5  =  dihedral  angle  defined  by  09C8C4H11  atoms. 
t6  =  dihedral  angle  defined  by  O9C8N10C12  atoms, 
r 7  «  dihedral  angle  defined  by  C8N10C12C13  atoms. 
t 8  =  dihedral  angle  defined  by  N10C-|2C13N15  atoms. 

AE  =  energy  difference  (kcal)  relative  to  the  most  stable  conformer  [AE  =  E  -  E(1_syn)]. 
b  =  energy  difference  (kcal)  relative  to  twice  the  energy  of  the  most  stable  conformer  [AE  =  E  -  2  x  E{1_syn)]. 


nation  of  cyclic  glydvpd  units  stabilizes  the  dimers 
in  almost  10  kcal/mol  more  than  that  of  open 
ones.  For  both  molecules,  syn~  and  antiperiplanar 
conformations  of  the  monomers  lead  to  structures 
of  similar  energies  after  dimerization. 

The  coordination  of  antiperiplanar  conforma¬ 
tions  retains,  in  the  dimer,  the  molecular  portion 
that  defines  the  pharmacophore  (Figs.  5  and  6).  On 
the  other  hand,  the  energy  difference  between  the 
anti-anti  and  syn-syn  conformers  is  as  small  as  the 
energy  difference  between  the  anti  and  syn  confor¬ 
mations  of  the  monomers  (Tables  I  and  II).  This 

TABLE  III _ 


Most  relevant  ab  initio  calculated  structural  data 
of  the  stable  conformers  of  N-formyl-glycine.a 


Con¬ 
former  d  O 

Hjntra  ^6 

T7 

Uj 

<1 

CO 

In  vacuo 

1 

5.4 

-176.0 

106.5 

35.3  0.0 

2 

2.0 

-6.0 

79.0 

-60.5  0.2 

In  solvent 

1 

6.1 

-171.3 

82.5 

172.7  0.0 

2 

1.8 

-13.1 

75.4 

-52.9  0.3 

ad  O — Hlntra  =  distance  (A)  between  09  and  H33. 
t6  =  dihedral  angle  defined  by  O9C8N10C12  atoms. 
t7  =  dihedral  angle  defined  by  C8N10C12C13  atoms. 
r8  =  dihedral  angle  defined  by  N10C12C13O15  atoms. 

AE  =  energy  difference  (kcal)  relative  to  the  most  stable 
conformer  [AE  =  E  -  E(1)]. 


fact  demonstrates  that  anti-anti,  anti-syn,  and  syn- 
syn  dimers  can  be  formed,  and  in  the  first  two 
cases  result,  according  to  our  structural  model  for 
the  pharmacophore,  in  biologically  active  struc¬ 
tures  for  both  glyvpd  and  glydvpd. 

The  N-substituted  carboxamide  portion  of 
dimeric  glyvpd  and  glydvpd  bears  a  disubstitu¬ 
tion  on  the  nitrogen  atom  (Figs.  5  and  6),  and, 
regarding  its  conformation,  complies  with  the  re¬ 
quirements  imposed  by  the  pharmacophore  when 
the  anti-anti  and  the  anti-syn  dimers  are  consid¬ 
ered  for  both  molecules.  However,  recent  pharma- 


TABLE  IV _ 

Most  relevant  ab  initio  calculated  structural  data 
for  the  stable  conformers  of  N-formyl-glycinamide.a 


Con¬ 
former  d  O 

Hjntra  ^*6 

t8  A  E 

In  vacuo 

1 

2.2 

2.8 

-85.3 

66.6  0.0 

2 

3.8 

-173.7 

-112.6 

8.8  2.4 

In  solvent 

1 

2.1 

7.7 

-82.9 

48.9  0.0 

2 

3.8 

-174.7 

-94.6 

-9.7  4.0 

ad  O— Hlntra  =  distance  (A)  between  Og  and  H33. 
t6  =  dihedral  angle  defined  by  O9C8N10C12  atoms. 
t7  =  dihedral  angle  defined  by  C8N10C12Ci3  atoms. 
t 8  =  dihedral  angle  defined  by  N10C12C13N15  atoms. 

AE  =  energy  difference  (kcal)  relative  to  the  most  stable 
conformer  [AE  =  E  -  E(1)]. 
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cological  tests  performed  in  our  laboratory  have 
demonstrated  that  disubstitution  on  the  nitrogen, 
when  it  implies  voluminous  groups,  larger  that 
ethyl,  leads  to  inactive  structures,  even  when  the 
conformational  requirements  imposed  by  the  phar¬ 
macophore  are  satisfied.  This  effect,  which  is 
presently  under  investigation,  seems  to  be  associ¬ 
ated  with  a  steric  hindrance  to  approach  the  recep¬ 
tor  site. 

Both  glycine  (glycinamide)  and  monomeric 
glyvpd  (glydvpd),  which  are  the  N-substituents  in 
the  dimers,  are  large  enough  to  block,  in  some 
way,  the  activity.  However,  there  is  another  — NH2 
group  in  the  glycinamide  moiety  of  glydvpd 
(Fig.  6).  Its  orientation  is  fixed  in  glydvpd  by 
dimerization,  synclinal  to  one  hydrogen  on  the  C 
atom  adjacent  to  the  one  to  which  it  is  bonded. 
This  H  atom  is  antiperiplanar,  thus,  to  the  oxygen, 
and  satisfies,  in  this  way,  the  geometric  require¬ 
ments  defined  by  the  pharmacophore  (Fig.  1). 
When  electronic  descriptors  are  considered,  the 
electronic  distribution  of  this  group  also  matches 
the  one  calculated  for  the  group  shown  in  Figure  1 
(Table  V).  This  group  can  be  considered,  thus, 
responsible  for  the  pharmacophoric  activity.  Pro¬ 
vided  that  this  group  can  approach  the  receptor 
site,  the  energy  involved  in  the  interaction  will  be 
large  enough  to  break  the  H  bond  that  may  hinder 
the  availability  of  the  carbonyl  oxygen.  The  com¬ 
parison  of  this  group  with  the  one  shown  in  Fig¬ 
ure  1  shows  that  the  similarities  between  them 
apply  to  the  H  atom  but  not  to  the  other  sub¬ 
stituents  of  the  tertiary  carbon  atom  (C12).  Whereas, 
in  agreement  with  our  model,  the  H  atom  is  oppo¬ 
site  to  the  carbonyl  oxygen,  the  other  substituents 
are  not  carbon  atoms,  but  one  nitrogen  and  one 


hydrogen.  This  evidence  can  lead  to  two  different 
conclusions: 

1.  We  can  redefine  our  pharmacophore  relaxing 
the  requirement  of  having  two  carbon-con¬ 
taining  groups  bonded  to  the  sp3  carbon  atom 
of  Figure  1.  The  corrected  model  requires  one 
carbon  atom  or  any  bioisosteric  substitution* 
in  an  anticlinal  conformation  relative  to  the 
aminic  nitrogen  of  the  amide  moiety,  in  addi¬ 
tion  to  the  hydrogen  atom  that  is  antiperipla¬ 
nar  to  the  carbonyl  oxygen.  No  requirements 
are  posed  on  the  nature  of  the  third  sub¬ 
stituent  of  the  carbon  atom. 

2.  We  can  reconsider  the  new  group  to  which 
the  activity  is  assigned.  It  becomes  evident 
that  the  portion  involved  resembles  more 
closely  glycinamide  than  vpd.  On  this  basis, 
and  with  the  knowledge  that  glycine  is  also 
an  inhibitory  neurotransmitter,  we  can  asso¬ 
ciate  the  antiepileptic  activity  of  glydvpd 
with  its  glycinamide  moiety.  A  question  is 
now  open  of  whether  the  activity  of  glydvpd 
is  originated  in  the  binding  of  either  the  vpd 
or  the  glyd  moieties  to  their  specific  receptor 
sites. 

We  strongly  support  the  first  conclusion  be¬ 
cause  the  second  one  does  not  explain,  again,  the 
lack  of  activity  of  glyvpd,  which  should  also  be 
capable  of  reaching  the  glycinergic  receptors. 

We  can  accept,  on  this  basis,  that  a  second 
pharmacophoric  group  in  the  glydvpd  molecule, 
which  is  lacking  in  glyvpd,  is  responsible  for  its 
antiepileptic  activity.  This  explanation  does  not 
have  to  disregard  the  pharmacokinetic  evidence, 

*  NH  substitutes  bioisosterically  a  C  atom  [37]. 


TABLE  V _ 

AMI  calculated  charges  on  the  atoms  that  define  both  pharmacophoric  groups.3 


Conformer 

q  °8 

q  o9 

q  Nio 

Q  C-13 

q  o14 

q  n15 

Pharmacophore 

+0.30 

-0.38 

-0.37 

+0.30 

-0.38 

-0.37 

glyvpd:  2-anti 

+0.30 

-0.35 

-0.35 

— 

— 

— 

glydvpd:  2-anti 

+0.31 

-0.39 

-0.37 

•0.27 

-0.38 

-0.43 

glyvpd:  10-anti-anti 

+0.31 

-0.37 

-0.38 

— 

— 

— 

+0.31 

-0.37 

-0.38 

— 

— 

— 

glydvpd:  11 -anti-anti 

+0.31 

-0.40 

-0.37 

+0.28 

-0.43 

-0.42 

+0.31 

-0.40 

-0.37 

+0.28 

-0.43 

-0.42 

aln  the  glyvpd  molecule  the  second  group  is  lacking. 
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which  shows  a  large  retention  time  (half-life)  of 
glydvpd  in  the  body,  in  dogs  [20],  a  fact  that 
improves  the  possibility  of  interaction  of  the  drug 
before  elimination.  Both  interpretations,  pharma- 
cokinetical  and  structural,  refer  to  different  steps 
of  the  interaction:  transport  inside  the  body  and 
interaction  at  the  receptor  site,  respectively.  Even 
if  a  lower  concentration  of  glyvpd  can  reach  the 
receptor  site,  it  will  not  show  antiepileptic  activity 
according  to  the  results  of  the  conformational  anal¬ 
ysis  and  the  structural  pattern  here  proposed. 


CONCLUSIONS 

We  have  performed  a  conformational  analysis 
of  glyvpd  and  glydvpd  at  a  semiempirical  AMI 
level,  considering  H-bond  formation  in  the  stabi¬ 
lization  of  monomer  and  dimer  structures.  The 
results  related  to  H-bond  formation  have  been 
confirmed  by  ab  initio  G94(6-31  +  G(d,p))  calcula¬ 
tions  for  a  smaller  system  (N-formylglycine/gly- 
cinamide),  for  both  the  isolated  molecules  and 
solvent  simulated  conditions. 

Both  methodologies  give  similar  results,  stabi¬ 
lizing  the  dimers  over  the  monomers,  and  favor¬ 
ing  the  cyclic  monomers  over  the  open  ones  for 
glydvpd.  In  relation  to  the  different  response  of 
glyvpd  and  glydvpd  against  convulsion,  we  con¬ 
clude  that  no  justification  can  be  given  on  the  basis 
of  the  structural  data  of  the  monomers.  Both  of 
them,  either  as  cyclic  or  as  open  units,  satisfy  the 
requirements  imposed  by  the  pharmacophore  that 
we  have  previously  proposed  [19]. 

Dimerization  leads  to  disubstitution  of  the 
aminic  nitrogen  of  vpd,  a  fact  that,  depending  on 
the  size  of  the  substituents,  has  been  found  to 
block  the  anticonvulsant  activity. 

We  associate  the  antiepileptic  activity  of  glyd¬ 
vpd  to  the  — NH2  group  of  the  glyd  moiety  which, 
while  not  present  in  glyvpd,  defines,  with  the 
adjacent  groups,  a  similar  pattern  to  the  one  shown 
in  Figure  1.  On  the  basis  of  their  differences  we 
have  redefined  our  pharmacophore.  The  corrected 
model  requires  one  carbon  atom  or  any  bioisos- 
teric  substituent  in  an  anticlinal  conformation  rela¬ 
tive  to  the  aminic  nitrogen  of  the  amide  moiety,  in 
addition  to  one  hydrogen  atom  that  should  be 
antiperiplanar  to  the  carbonyl  oxygen.  No  addi¬ 
tional  requirement  concerning  the  third  sub¬ 
stituent  of  the  sp 3  carbon  atom  of  Figure  1  is 
included  in  the  definition  of  the  pharmacophore. 
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ABSTRACT:  The  computed  molecular  surface  electrostatic  potentials  of  a  group  of 
anticonvulsants  of  various  chemical  types  were  investigated  with  the  objective  of 
identifying  common  features  that  may  be  related  to  their  activities.  The  calculations  were 
carried  out  with  the  density  functional  B3P86/6-31G*  procedure,  using  HF/STO-3G*- 
optimized  geometries.  Analysis  of  several  statistically  based  properties  of  the  surface 
potentials  indicates  that  the  negative  regions  are  of  primary  importance  and  that  an 
optimum  intermediate  level  of  local  polarity,  or  internal  charge  separation,  is  required. 
©  1998  John  Wiley  &  Sons,  Inc.  Int  J  Quant  Chem  70:  1137-1143,  1998 


Introduction 

Anticonvulsant  or  antiepileptic  drugs  are  com¬ 
pounds  that  are  found  clinically  to  help  control 
epileptic  seizures  [1-3].  They  cover  a  range  of 
chemical  categories,  which  act  upon  different  types 
of  convulsive  disorders.  For  example,  certain  bar¬ 
biturates  and  hydantoins  are  effective  against 
grand  mal  and  psychomotor  epilepsy,  while  some 
succinimides  are  used  against  petit  mal  epilepsy. 
The  mechanisms  of  action  of  these  and  other  anti¬ 
convulsants  are  not  fully  understood;  one  explana¬ 
tion  for  their  selectivity  is  that  they  interact,  ini¬ 
tially  or  finally,  with  different  receptors. 

The  interaction  of  a  molecule  with  a  receptor  is 
an  example  of  a  "recognition"  process,  in  which 

Correspondence  to:  P.  Politzer. 


the  receptor  recognizes  that  the  molecule  has  cer¬ 
tain  key  features  that  will  promote  their  interac¬ 
tion.  This  occurs  before  any  processes  of  bond 
breaking  or  bond  making  take  place.  Such  key 
features  have  often  been  identified  through  the 
analysis  of  the  electrostatic  potential  V(r)  that  is 
created  in  the  space  surrounding  a  molecule  by  its 
nuclei  and  electrons.  It  is  through  this  potential 
that  a  molecule  interacts  with  other  systems  in  its 
vicinity.  The  affinity  of  a  particular  molecule  for  a 
specific  receptor  has  been  shown  in  a  number  of 
cases  to  depend  upon  the  degree  to  which  the 
electrostatic  potential  of  the  former  possesses  cer¬ 
tain  characteristics  that  have  been  established  as 
being  necessary  for  effectively  interacting  with  that 
receptor  [4-11]. 

Our  objective  in  this  work  was  to  use  the  molec¬ 
ular  electrostatic  potential  V(r)  as  a  tool  for  com¬ 
paring  and  analyzing  a  large  group  of  anticonvul- 
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sant  drugs  of  various  chemical  types.  1-6  are 
derivatives  of  the  five-membered  hydantoin  ring. 
7-9  are  barbiturates,  related  to  the  six-membered 
heterocycle  barbituric  acid;  10  is  obtained  from  7 
by  the  reduction  of  one  carbonyl  group.  11-13  are 
carbamazepine  and  two  of  its  derivatives,  respec¬ 
tively.  14  is  an  acetyl  urea,  phenacemide.  15-17 
are  derivatives  of  the  five-membered  heterocycle 
succinimide,  and  18-21  are  an  assortment  of  other 
types.  A  striking  feature  of  most  of  these  21 
molecules  is  the  prevalence  of  ureide  and  amide 
linkages,  usually  in  cyclic  form.  All  of  them  except 
5  and  6  exhibit  anticonvulsant  activity,  to  varying 
degrees  [1-3].  Some  also  have  rather  severe  side 
effects,  including  skin  rashes,  fever,  and  dizziness, 
and  in  the  case  of  phenacemide  (14),  bone  marrow 
depression  and  hepatocellular  damage  [1]. 


Our  approach  was  to  analyze  the  electrostatic 
potentials  on  the  molecular  surfaces  of  1-21,  both 
qualitatively  in  terms  of  relative  patterns  of  posi¬ 
tive  and  negative  regions  and  quantitatively  using 
a  number  of  statistically  derived  descriptors.  We 
have  sought  to  identify  features  of  the  surface 
electrostatic  potentials  that  may  be  related  to  anti¬ 
convulsant  activity. 


Methods 

The  electrostatic  potentials  on  the  molecular 
surfaces  of  1-21  were  computed  with  the  density 
functional  B3P86/6-31G*  procedure,  using  struc¬ 
tures  optimized  at  the  HF/STO-3G*  level  and  the 
Gaussian  94  code  [12]: 
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Following  Bader  et  al.  [13],  the  surfaces  were  taken 
to  be  the  0.001  au  contour  of  the  molecular  elec¬ 
tronic  density,  p(r). 

The  electrostatic  potential  VTr)  created  in  the 
space  surrounding  a  molecule  by  its  nuclei  and 
electrons  is  given  rigorously  by  Eq.  (1): 


V(r)  = 


(  p(x')dr' 

J  Ir'  -  r|  • 


(1) 


ZA  is  the  charge  on  nucleus  A,  located  at  R^.  The 
sign  of  V(r)  at  any  point  r  is  the  net  result  of  the 
positive  and  negative  contributions  of  the  nuclei 
and  electrons.  Sites  reactive  toward  electrophiles 
can  be  identified  and  ranked  by  means  of  the 
locations  and  magnitudes  of  the  most  negative 
potentials,  either  on  the  molecular  surface  (Vs  min) 
or  in  three  dimensions  (Vmin),  while  the  most  posi¬ 
tive  surface  potentials  (Vs  x)  play  an  analogous 
role  for  nucleophilic  attack  [14-17]. 

To  extract  additional  information  from  the  elec¬ 
trostatic  potential  on  the  molecular  surface,  we 
have  introduced  several  statistical  quantities  that 
reflect  its  detailed  pattern  and  physically  meaning¬ 
ful  features  [18-21].  These  quantities  (II,  at20t,  and 


v )  are  given  by  Eqs.  (2)— (4),  which  involve  sum¬ 
mations  over  a  grid  of  points  covering  the  entire 
surface: 


n  =  :I  W(rf)  -  vs 


(2) 


1  Ul  n 

°~tlt  =  o-+  +  ul  =  —  £  [  V+(r;)  -  ys+  ] 

m  i=\ 


+  -  £  fv-(r,)  -  Vs  |2  (3) 


ni- 1 

1 

Vs  is  the  average  potential:  Vs  =  — 

V+(r,)  and  V~(r})  are  the  positive  and  negative 
values  of  V(r)  on  the  surface,  and  V$  and  V$  are 

the  averages:  V$  =  — 'L™=1V+(ri)  and  V$  = 
^  m 


2  _  2 


(7  7  (J 


V  = 


(14) 


II  is  the  average  deviation  of  the  potential  on 
the  surface;  it  is  interpreted  as  a  measure  of  the 
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local  polarity,  or  internal  charge  separation,  that  is 
present  even  in  molecules  with  zero  dipole  mo¬ 
ments,  such  as  BF3  and  para- dinitrobenzene 
[18,19,21].  a20t  is  the  sum  of  the  variances  of  the 
positive  and  negative  surface  potentials,  <r+  and 
a2,  respectively.  The  variance  is  a  measure  of  the 
spread,  or  range,  of  a  collection  of  values,  and  by 
definition  emphasizes  the  extremes.  cr\,  a2,  and 
are  viewed  as  indicating  the  net  positive, 
negative,  and  total  electrostatic  interaction  tenden¬ 
cies  of  a  molecule.  The  effectiveness  of  a20t  can  be 
increased  in  some  instances  by  combining  it  with 
an  index  of  "electrostatic  balance/'  This  refers  to 
the  degree  of  similarity  between  <j2  and  a2,  which 
indicates  the  extent  to  which  the  molecule  can 
interact  through  both  its  positive  and  its  negative 
surface  regions.  The  quantity  v  is  a  measure  of  this 
similarity;  as  a2  and  cr2  approach  each  other  in 
magnitude,  whether  they  be  large  or  small,  v  ap¬ 
proaches  an  upper  limit  of  0.250.  The  product  va20t 
is  an  important  term  in  representing  properties 
such  as  boiling  points,  critical  temperatures,  and 
heats  of  vaporization  and  sublimation,  in  which 
the  molecules  are  interacting  with  others  of  the 
same  kind  [18,19,21], 


Results 

Examples  of  the  molecular  surface  electrostatic 
potentials  of  1-21  are  shown  in  Figures  1-3,  for 
phenytoin  (1),  phenobarbital  (7),  and  carba- 


FIGURE  1 .  Calculated  electrostatic  potential  on  the 
molecular  surface  of  phenytoin  (1).  Potential  ranges,  in 
kcal/mol:  (red)  more  positive  than  30;  (yellow)  between 
15  and  30;  (green)  between  0  and  15;  (blue)  between 
-15  and  0;  (pink)  more  negative  than  -15. 


FIGURE  2.  Calculated  electrostatic  potential  on  the 
molecular  surface  of  phenobarbital  (7).  Potential  ranges, 
in  kcal  /  mol:  (red)  more  positive  than  30;  (yellow) 
between  15  and  30;  (green)  between  0  and  15;  (blue) 
between  -15  and  0;  (pink)  more  negative  than  -15. 

mazepine  (11).  In  general,  the  most  positive  poten¬ 
tials  (red  regions)  are  associated  with  amine, 
amide,  or  hydroxyl  hydrogens;  the  most  negative 
(pink  regions)  are  due  to  carbonyl  oxygens  and /or 
nitrogen  lone  pairs.  As  seen  in  Figures  1-3,  the 
local  maxima  and  minima  are  generally  in  close 
proximity,  on  protruding  portions  or  ends  of 


FIGURE  3.  Calculated  electrostatic  potential  on  the 
molecular  surface  of  carbamazepine  (11).  Potential 
ranges,  in  kcal/mol:  (red)  more  positive  than  30;  (yellow) 
between  15  and  30;  (green)  between  0  and  15;  (blue) 
between  -15  and  0;  (pink)  more  negative  than  -15. 
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TABLE  I _ 

Calculated  molecular  surface  properties. 


Surface 

area 

n 

2  2  2 
<*+  <*-  (Tfot 

^S.max  Vs>i 

Molecule 

(A2) 

(kcal  /  mol) 

(kcal/mol)2 

V 

(kcal  /  mol) 

Hydantoins 

Phenytoin  (1) 

270.0 

11.72 

69.6 

95.5 

165.1 

0.244 

47.4 

-34.1 

Mephenytoin  (2) 

249.9 

11.41 

43.8 

93.8 

137.5 

0.217 

43.3 

-33.9 

Ethotoin  (3) 

240.3 

11.64 

43.6 

120.4 

164.0 

0.195 

38.6 

-34.4 

A/,A/'-Dimethylphenytoin  (4) 

301.4 

10.26 

18.3 

105.6 

123.9 

0.126 

17.5 

-34.4 

p-Methylphenytoin  (5) 

290.7 

11.19 

62.9 

99.0 

161.9 

0.238 

46.7 

-34.7 

m-Nitrophenytoin  (6) 

297.1 

13.84 

92.6 

86.2 

178.9 

0.249 

50.7 

-31.7 

Barbiturates 

Phenobarbital  (7) 

244.7 

11.03 

86.7 

74.2 

161.0 

0.248 

44.7 

-28.6 

Mephobarbital  (8) 

262.8 

10.00 

50.2 

75.9 

126.1 

0.240 

43.6 

-28.4 

Metharbital  (9) 

221.4 

10.79 

43.3 

80.9 

124.2 

0.227 

43.1 

-28.2 

Desoxybarbiturate 

Primidone  (10) 

237.7 

12.77 

120.7 

102.9 

223.7 

0.248 

43.7 

-35.4 

Carbamazepines 
Carbamazepine  (11) 

259.6 

11.44 

44.4 

105.0 

149.4 

0.209 

33.3 

-42.2 

1-Hydroxycarbamazepine  (12) 

267.0 

12.36 

97.6 

97.4 

195.0 

0.250 

54.3 

-42.3 

Epoxycarbamazepine  (13) 

262.1 

11.87 

55.3 

110.7 

166.0 

0.222 

35.8 

-38.5 

Acetyl  urea 

Phenacemide  (14) 

215.5 

14.38 

113.4 

116.9 

230.4 

0.250 

47.4 

-40.2 

Succinimides 

Methsuximide  (15) 

236.7 

10.79 

22.1 

88.1 

110.3 

0.160 

21.1 

-32.9 

Ethosuximide  (16) 

179.3 

12.78 

46.2 

97.6 

143.8 

0.218 

43.6 

-33.3 

Phensuximide  (17) 

225.3 

12.66 

27.5 

116.5 

144.0 

0.155 

23.0 

-36.3 

Others 

Diazepam  (18) 

298.5 

10.08 

36.3 

88.2 

124.6 

0.206 

26.9 

-35.9 

Trimethadione  (19) 

177.2 

13.52 

17.4 

90.2 

107.6 

0.136 

20.5 

-35.7 

Felbatol  (20) 

259.9 

13.71 

108.4 

111.2 

219.6 

0.250 

46.4 

-39.2 

Lamotrigine  (21) 

249.6 

12.53 

99.4 

94.5 

193.9 

0.250 

38.5 

-38.7 

the  molecules.  The  computed  surface  properties 
of  1-21,  including  their  areas,  are  presented  in 
Table  I. 


Discussion 

Table  I  contains  at  least  three  representatives  of 
each  of  four  chemical  categories,  plus  six  molecules 
of  various  other  types.  While  the  near-ubiquity  of 
ureide  and  amide  linkages  is  a  common  element, 
there  are  also  substantial  differences.  There  is  a 
considerable  range  of  sizes,  the  surface  areas  being 
between  177  and  301  A2.  The  molecules  contain 
one  to  three  rings,  sometimes  fused,  which  may  be 
five-,  six-  or  seven-membered,  saturated  or  unsat¬ 
urated.  While  a  heterocyclic  ring  appears  to  be  a 


key  structural  constituent  in  many  instances,  this 
is  not  the  case  in  14  and  20. 

The  positive  regions  of  the  surface  electrostatic 
potentials  of  these  molecules  provide  further  con¬ 
trasts.  As  mentioned  above,  the  strongest  positive 
potentials,  with  VS  max  between  33  and  54 
kcal/mol,  are  produced  by  amine,  amide,  or  hy¬ 
droxyl  hydrogens.  However,  there  are  no  such 
hydrogens  in  4,  15,  and  17-19,  and  their  VSmax 
are,  consequently,  much  weaker,  between  17  and 
27  kcal /mol.  These  five  molecules  also  have  among 
the  lowest  <x2  and  v  values,  indicating  that  the 
positive  regions  on  their  surfaces  are  relatively 
weak. 

On  the  other  hand,  the  negative  surface  regions, 
while  less  extensive  in  area,  are  much  more  uni¬ 
form  in  strength.  The  VS  min  are  all  within  a  rela- 
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tively  narrow  range,  —28  to  —42  kcal/mol,  as  are 
the  m2,  74  to  120  (kcal/mole)2.  [In  contrast,  the  <j\ 
are  between  17  and  121  (kcal/mol)2.]  It  seems 
reasonable  to  infer  that  it  is  the  negative  potentials 
that  are  of  primary  importance  in  anticonvulsant 
activity. 

A  particularly  striking  point  of  similarity  among 
the  molecules  in  Table  I  is  the  local  polarity,  II.  In 
earlier  work  [18,19,22],  encompassing  well  over 
100  molecules,  mostly  organic,  we  found  n  to 
vary  between  2  and  24  kcal/mol;  most  often,  how¬ 
ever,  it  is  less  than  10  kcal/mol.  What  is  notable  in 
Table  I  is  that  17  of  the  21  n  values  are  between  10 
and  13  kcal/mol  and  the  largest  overall  is  14.38 
kcal/mol.  Thus,  the  internal  charge  separations  in 
these  molecules  are  quite  significant,  but  are  rather 
strictly  circumscribed  in  magnitude.  This  suggests 
a  need  for  a  substantial  but  not  excessive  degree  of 
hydrophilic  character. 

It  is  interesting  to  note  that  the  surface  electro¬ 
static  potentials  of  the  inactive  molecules,  5  and  6, 
do  not  differ  dramatically  from  those  of  the  others. 
Their  lack  of  anticonvulsant  activity  may  reflect  an 
interplay  of  several  factors.  For  example,  5  and  6 
are  among  the  largest  molecules  in  Table  I;  only 
two  others  are  slightly  larger.  Thus,  steric  effects 
could  be  involved.  5  and  6  also  have  among  the 
highest  VS  wnx  values;  this  increases  the  possibility 
of  a  nonproductive  interaction  with  some  negative 
site.  The  inactivities  of  5  and  6  might  be  the  results 
of  several  such  contributing  factors. 


Summary 

Our  investigation  of  the  molecular  surface  elec¬ 
trostatic  potentials  of  anticonvulsants  of  different 
chemical  types  has  identified  two  common  fea¬ 
tures  that  may  be  related  to  their  activities: 

(a)  Surface  regions  of  relatively  strong  negative 
potentials.  This  suggests  that  the  interac¬ 
tions  with  the  receptorfs)  involve  positive 
sites  on  the  latter,  which  may,  for  instance, 
be  acting  as  hydrogen-bond  donors. 

(b)  Local  polarities  within  a  rather  narrow  range 
of  intermediate  values.  This  may  reflect  a 
need  for  an  optimum  balance  between  hy- 


drophilicity  and  hydrophobicity,  such  that 
the  molecules  be  able  to  pass  through  the 
cell  membrane  but  not  enter  into  interac¬ 
tions  that  prevent  them  from  reaching  the 
appropriate  receptorfs). 
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ABSTRACT:  Higher  plants  use  the  protein  phytochrome  as  a  photosensor.  In 
physiological  temperatures  phytochrome  exists  in  two  forms:  Pr  and  Pfr.  The  chromophore 
of  phytochrome  is  an  open-chain  tetrapyrrole.  On  the  pathway  from  Pr  to  Pfr  four 
intermediates  (Lumi-R,  Meta-Ra,  Meta-Rb,  and  Meta-Rc)  can  be  distinguished,  while 
only  two  (Lumi-F  and  Meta-F)  can  be  seen  on  the  way  back  from  Pfr  to  Pr.  We  have  used 
the  x-ray  structure  of  the  C-Phycocyanin  protein  Fremyella  diplosiphon  bacteria  as  a 
template  to  build  a  model  (  ~  200  atoms)  that  includes  only  the  chromophore  and  five 
amino  acids  of  the  phytochrome  (Arg316-Cys321“His322-Leu323-Gln324)  around  it. 
Using  the  existing  experimental  evidences,  we  have  proposed  a  three-dimensional  (3D) 
structure  for  Pr,  Pfr,  and  intermediates  and  a  mechanism  for  the  photoisomerization  as 
well.  Structures  were  fully  optimized  using  AMI  (Unichem  package  on  a  Cray  J90- 
NACAD).  Using  the  INDO/S  method  of  Zerner  and  co-workers,  we  calculated  the 
absorption  spectra  of  the  model  compounds  and  compared  them  with  the  experimental 
data.  The  oscillator  strength  ratio  is  an  indicator  of  the  chomophore  conformation  in 
biliproteins.  The  calculated  spectra  reproduces  well  the  spectra  of  the  phytochrome  (Pr, 
Pfr,  and  intermediates)  except  for  the  lower  energy  band.  This  result  is  attributed  to  the 
small  number  of  amino  acids  in  the  models.  The  calculated  ratios  (/vis//uv  ”  /osc  °f 
visible  band  over  /osc  of  UV  band  and  f2/f\  ~  /osc  second  absorption  band  over  fosc  of 
first  absorption  band)  for  the  models  match  very  well  the  experimental  ratios  obtained 
for  the  phytochrome  (Pr,  Pfr,  and  intermediates).  This  supports  the  proposed  mechanism 
for  the  photoisomerization  process.  ©  1998  John  Wiley  &  Sons,  Inc.  Int  J  Quant  Chem  70: 
1145-1157,  1998 
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Introduction 

Organisms  can  use  light  in  two  ways:  either 
to  use  its  energy  to  keep  its  cells  functioning 
or  to  translate  optical  signals  into  some  kind  of 
biological  response.  Among  the  latter,  there  are  the 
so-called  visual  pigments,  rhodopsins  in  verte¬ 
brates  and  phytochrome  in  higher  plants.  Most 
biological  photosensors  are  photochromic,  i.e.,  af¬ 
ter  the  initial  photochemical  event  (a  photochemi¬ 
cal  reaction  in  a  generally  very  complex  biochemi¬ 
cal  cycle),  the  system  is  restored  to  its  initial, 
"ready"  state  [1]  (for  recent  reviews  on  light  signal 
transduction  in  plants,  see  [2-4]).  Higher  plants 
use  only  visible  light  for  photosynthesis,  but  they 
respond  to  a  much  wider  range  of  the  electromag¬ 
netic  spectrum,  including  the  ultraviolet  (UV)  and 
the  near-infrared  (or  far-red)  light.  Moreover,  be¬ 
cause  higher  plants  have  developed  a  very  sophis¬ 
ticated  sensory  apparatus,  they  are  able  to  identify 
the  direction  and  the  intensity  of  the  incoming 
light  [3].  The  very  diverse  biological  responses  to 
the  radiation  are  called  photomorphogenesis. 

The  red /near-infrared  region  of  the  spectrum  is 
recognized  by  photoreceptors  known  as  phy¬ 
tochromes.  Up  to  now,  five  phytochrome  genes 
have  been  cloned  in  Arabidopsis  thaliana,  which 
correspond  to  five  different  proteins  (phyA  to 
phyE)  [4,  5].  These  different  phytochrome 
molecules  have  specialized  photosensory  functions 
[6-8].  In  this  study,  we  will  be  referring  to  phy¬ 
tochrome  A,  the  most  studied  phytochrome.  All 
phytochrome  molecules  have  the  same  basic  struc¬ 
ture:  they  are  biliproteins  with  a  molecular  weight 
of  124-129  kDa  (1100-1170  residues)  and  a  single 
chromophore  bound  to  a  cysteine  residue  in  the 
~  70-kDa  N-terminal  [9].  Phytochrome  molecules 
exist  as  a  dimer,  with  dimerization  occurring 
through  the  C-terminal  region  [10].  The  chro¬ 
mophore  of  phytochrome  is  a  phycobilin,  an 
open-chain  tetrapyrrole  [11].  In  physiological  tem¬ 
peratures,  phytochrome  exists  in  two  forms:  a 
red-absorbing  inactive  form  (Pr,  660  nm)  and  a 
far-red-absorbing  active  form  (Pfr,  730  nm).  Satu¬ 
ration  of  the  environment  with  730-nm  light  in¬ 
duces  a  shift  in  the  equilibrium  between  the  two 
forms  toward  99%  of  the  Pr  form,  whereas  satura¬ 
tion  with  660-nm  light  shifts  the  equilibrium  to 
88%  of  the  Pfr  form,  thus,  triggering  a  physiologi¬ 
cal  response  [11]. 


The  primary  photochemical  event  is  a  double 
isomerization  at  the  methine  bridge  between  the 
pyrrole  rings  C  and  D  (Fig.  1)  [11].  However,  Z-E 
isomerization  of  the  chromophore  requires  some 
space  in  the  protein  pocket  for  rings  C  and  D. 
Minimum  space  would  be  required  for  a  corota¬ 
tion  of  the  single  bonds  between  C  and  D  rings 
( syn-anti )  [12].  Furthermore,  there  are  contradic¬ 
tory  results  between  Fourier  transform  infrared 
(FTIR)  and  Raman  resonance  methods  related  to 
the  protonation  state  of  the  pyrrole  nitrogen  of 
ring  B  (Fig.  1).  The  FTIR  method  shows  that  both 
Pr  and  Pfr  forms  are  protonated,  whereas  the  Ra¬ 
man  resonance  (RR)  spectrum  shows  a  protonated 
Pr  chomophore  and  a  deprotonated  Pfr  chro¬ 
mophore  [12]. 

Since  it  was  first  discovered  by  Borthwick,  Hen¬ 
dricks,  and  their  co-workers  in  the  1950s  [13], 
phytochrome  has  been  studied  by  many  spec¬ 
troscopy  methods,  including  time-resolved  absorp- 


FIGURE  1 .  Proposed  structure  of  the  phytochrome 
chromophore  in  the  native  protein  (Pr  and  Pfr  forms) 
[11]. 


1146 


VOL.  70,  NO.  6 


PHYTOCHROME  STRUCTURE 


tion  (TROD)  [14],  fluorescence  [15],  circular  di- 
chroism  [16],  FTIR  [17],  FT-RR  spectroscopy  [18], 
low-temperature  spectroscopy  [19],  nanosecond 
laser  flash  photolysis  [20],  and  femtosecond  T-re- 
solved  spectroscopy  [21];  on  the  other  hand,  up  to 
now  no  X-ray  cristallographic  structure  of  phy¬ 
tochrome  has  been  obtained.  Both  the  primary 
structure  and  some  aspects  of  the  secondary  struc¬ 
ture  of  phytochrome  from  several  different  vege- 
tals  have  been  determined  [22],  but  because  phy¬ 
tochrome  exists  in  very  low  concentrations  in 
plants,  the  preparation  of  single  crystals  for  X-ray 
analysis  remains  elusive. 

From  the  theoretical  point  of  view,  a  few  at¬ 
tempts  to  study  this  problem  have  been  made.  In 
1993,  Smit  and  co-workers  made  a  force  field  vi¬ 
brational  analysis  of  the  chromophore  using 
biliverdin  dimethyl  esters  as  model  compounds 
[23].  Also  in  1993,  Scharnagl  and  co-workers  stud¬ 
ied  the  chromophore  using  molecular  dynamics 
and  INDO-S  [24]  and  electrostatic  calculations  [25] 
to  develop  a  model  of  the  phycoerythrocyanin 
chromophore  (from  phycobiliproteins  obtained 
from  bacteria  for  which  some  X-ray  structures  had 
been  previously  obtained).  More  recently,  using 
semiempirical  and  ab  initio  techniques,  Korkin  and 
co-workers  calculated  some  neutral  and  proto- 
nated  pyrromethenes  of  biological  interest  [26]. 

Based  on  the  X-ray  structure  of  C-Phycocyanin 
protein  from  Fremyella  diplosiphon  [27],  Parker  and 
co-workers  [28]  modeled  the  phytochrome  chro¬ 
mophore  binding  pocket.  They  changed  20  residues 
around  the  chromophore  binding  site  of  C-Phyco- 
cyanin  to  the  corresponding  residues  of  Avena 
phytochrome  A  and  minimized  the  model  using 
sophisticated  matching  procedures,  which  in¬ 
cluded  AMI  geometry  optimization  of  the  chro¬ 
mophore  moiety. 

Relaxation  products  observed  in  the  conversion 
Pr-Pfr  or  Pfr-Pr  with  different  absorption  spectra 
are  described  as  intermediates.  From  Pr  to  Pfr,  at 
least  four  intermediates  can  be  distinguished,  while 
at  least  two  can  be  seen  from  Pfr  to  Pr.  Figure  2 
shows  schematically  the  phytochrome  cycle,  in¬ 
cluding  the  distinct  intermediates  as  detected  by 
time-resolved  and  low-temperature  absorption 
spectroscopy  [11]. 
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FIGURE  2.  Schematic  diagram  of  the  photoconversion 
of  the  two  phytochrome  forms.  Temperatures  given  are 
the  minimum  values  for  the  individual  relaxation  steps; 
the  times  indicate  the  order  of  magnitude  of  the  relaxation 
process  at  room  temperature  [11]. 


of  the  phytochrome  is  unknown.  We  used  the 
X-ray  structure  of  the  C-Phycocyanin  protein  of  F. 
diplosiphon  bacteria  [27]  as  a  template.  Then,  we 
built  a  reduced  model  including  only  the  chro¬ 
mophore  and  five  residues  around  it.  Our  aim  was 
to  evaluate  the  configurational  and  conformational 
changes  that  occur  on  the  chromophore  during 
photoisomerization.  We  also  decided  to  examine 
the  protonation  state  of  the  intermediates  and  of 
the  Pfr  form  (Fig.  2)  [12]. 

The  C-Phycocyanin  protein  coordinates  of  F. 
diplosiphon  were  obtained  from  the  Protein  Data 
Bank.  The  model  geometries  (Pr,  Pfr,  and  interme¬ 
diates,  ~  200  atoms)  were  fully  optimized  using 
AMI  [29,  30]  with  keywords  GNORM  =  1.0,  PRE¬ 
CISE,  within  the  MOPAC  program  version  7.0  on  a 
workstation  IBM  RISC/6000  and  the  UNICHEM 
package  on  a  Cray  J90  (NACAD-COPPE-UFRJ). 

Absorption  spectra  were  calculated  using  the 
INDO/S  method  of  Zerner  and  co-workers  [31, 
32].  Solvent  effects  were  included  using  a  self-con¬ 
sistent  reaction  field  (SCRF)  routine,  with  dielec¬ 
tric  constant  s  =  78.54  (the  chromophore  selected 
is  near  the  surface  of  the  protein). 


Results  and  Discussion 


Method  and  Computational  Details 

Due  the  small  amount  of  the  protein  in  the 
plant  tissue,  the  three-dimensional  (3D)  structure 


The  central  point  of  this  work  is  an  analysis  of 
the  electronic  absorption  spectra  of  Avena  phy¬ 
tochrome  A  (Pr  and  Pfr  forms)  and  the  photoprod¬ 
ucts  observed  during  conversion  between  these 
forms  [11].  As  the  spectra  of  tetrapyrroles  is  known 
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to  be  sensitive  to  conformation,  we  thought  it 
worthwhile  to  try  to  use  the  experimental  spectra 
to  model  the  unknown  conformations.  The  key  to 
this  is  the  empirical  evidence  that  the  ratio  of  the 
oscillator  strength  of  the  visible  band  to  the  UV 
(Soret)  band  is  an  indicator  of  the  conformation  of 
the  tetrapyrrole  chromophore  [33,  34].  The  five 
amino  acids  used  served  two  purposes:  to  mimic 
the  peptide  chain  of  the  protein  near  the  chro¬ 
mophore  and  to  provide  the  interactions  (protona¬ 
tion  or  otherwise)  needed  to  stabilize  the  different 
conformations  which  correspond  to  the  different 
intermediates.  Therefore,  the  first  step  was  to  de¬ 
velop  an  operational  model  of  the  phytochrome 
molecule. 

MODEL  CONSTRUCTION 

In  the  absence  of  an  X-ray  crystal  structure  of 
phytochrome,  among  other  structures  of  analogue 
chromophores  available  from  PDB  (Protein  Data 
Bank),  we  selected  to  use  C-Phycocyanin  of  F. 
diplosiphon  as  a  template.  The  C-Phycocyanin  pro¬ 
tein  has  three  chromophores  very  similar  to  the 
phytochrome's  chromophore  (the  only  difference 
being  in  the  D  ring:  the  C-Phycocyanin  chro¬ 
mophore  has  an  ethyl  group  bound  in  the  D  ring 
instead  of  a  vinyl  group).  Due  to  its  high  homol¬ 
ogy  and  because  the  amino  acids  close  to  the 
chromophore  play  the  same  role  as  in  phy¬ 
tochrome  [12],  we  have  selected  the  (3  unit  ( /3- 
CPC1)  chromophore  (Fig.  3). 

We  started  building  our  model  with  the  frag¬ 
ment  W-Cys84-Leu85-Arg86-Asp87  (W,  the  te¬ 
trapyrrole  chromophore,  is  in  the  Z,Z,Z  syn 


Bacteria  Unit 

Selected  aminoacid  sequence 


Are79 


Cvs84  -  Leu85  -  Arg86  -  Asp87 


Chromophore 


Phvtochrome 

Equivalent  aminoacid  sequence 
Are316  Cvs321  -  His322  -  Leu323  -  Gln324 

1  jP  ' 

Chromophore  - \ 

NH  i 


FIGURE  3.  Comparison  of  the  selected  amino  acid 
sequence  in  the  bacteria  /3-subunit  and  the  equivalent 
amino  acid  sequence  in  the  phytochrome. 


conformation).  In  doing  this,  we  took  into  consid¬ 
eration  that  the  chromophore  of  phytochrome  is 
similarly  linked  to  Cys321  and  that  it  has  the  same 
conformation.  We  then  included  Arg79  as  a  coun¬ 
terion  to  the  propanoate  side  chains  of  rings  B  and 
C  of  the  tetrapyrrole.  A  very  important  point  is 
that  full  optimization  of  the  model  did  not  signifi¬ 
cantly  change  the  conformation  of  the  tetrapyrrole 
in  relation  to  the  X-ray  structure.  The  root  mean 
square  (RMS)  of  superimposed  structures  was  1.4 
A  for  the  C-Phycocyanin  optimized  model  against 
the  C-Phycocyanin  crystal:  172  atoms  superim¬ 
posed,  H  excluded  [Fig.  4(a)].  This  suggests  that 
the  five  selected  amino  acids  are  sufficient  to  rep¬ 
resent  the  main  interactions  between  the  chro¬ 
mophore  and  the  protein. 

The  next  step  was  to  model  the  phytochrome 
chromophore.  To  do  this,  we  used  as  a  template 
the  AMI  optimized  structure  of  the  C-Phycocyanin 
model  described  above.  We  kept  the  coordinates  of 
the  backbone  atoms  of  the  peptide  chain  and  of  the 
chromophore  and  changed  its  ethyl  group  for  a 
vinyl  group.  Next,  we  made  the  following  substi¬ 
tutions:  Cys84  is  renamed  Cys321,  Leu85  -» 
His322,  Arg86  ->  Leu323,  and  Asp87  ->  Gln324. 
We  conserved  the  Arg79  fragment  coordinates  and 
renamed  it  Arg316.  The  new  model  was  fully 
optimized  at  the  AMI  level.  It  is  important  to 
notice  that  this  procedure  did  not  significantly 
change  the  conformation  of  the  tetrapyrrole  chro¬ 
mophore.  The  RMS  of  superimposed  structures 
was  1.3  A  for  the  phytochrome-optimized  model 
against  the  C-Phycocyanin  crystal:  150  atoms  su¬ 
perimposed;  H  excluded;  superimposition  back¬ 
bone  only,  except  for  the  Arg  and  Cys,  which  are 
conserved  in  both  structures  [Fig.  4(b)].  As  men¬ 
tioned  above,  we  took  this  as  a  confirmation  that 
the  five  selected  amino  acids  in  the  C-Phycocyanin 
protein  play  the  same  role  as  the  equivalent  amino 
acids  in  phytochrome. 

Asp87  is  supposed  to  balance  the  positive  charge 
of  the  protonated  chromophore  in  the  C-Phyco¬ 
cyanin  protein.  However,  no  aspartate  is  found  in 
the  corresponding  position  of  the  phytochrome 
sequence.  As  the  phytochrome  is  believed  to  be 
protonated,  at  least  in  the  Pr  form  [18],  the  positive 
charge  must  be  balanced  by  an  aspartate  or  gluta¬ 
mate  further  away  from  the  chromophore  [12].  If 
this  is  the  case,  then  the  conformational  changes 
that  occur  during  later  stages  of  the  phototransfor¬ 
mation  could  remove  the  negative  charge  from 
near  the  chromophore,  and  hence  lead  to  deproto¬ 
nation. 
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FIGURE  4.  (a)  Superimposed  structures  of  the  C-Phycocyanin-optimized  model  (green)  against  the  C-Phycocyanin 
crystal  (gray),  (b)  Superimposed  structures  of  the  phytochrome-optimized  model  (green)  against  the  C-Phycocyanin 
crystal  (gray). 
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A  POSSIBLE  MECHANISM  FOR  TIIF 
PHOTOCHEMICAL  CYCLE 

Nuclear  magnetic  resonance  (NMR)  and  RR 
studies  have  shown  that  the  Pfr  form  has  ZZE  anti 
configuration  and  is  deprotonated  [11].  Besides, 
these  techniques  have  shown  that  the  first  interme¬ 
diate  (Lumi-R)  on  the  pathway  Pr  — >  Pfr  has  a 
ZZE  anti  configuration  and  that  there  are  modifi¬ 
cations  on  the  hydrogen  bonding  network  involv¬ 
ing  interactions  of  the  N~H  groups  of  rings  B  and 
C  with  the  protein  environment  [18].  Therefore, 
the  first  photochemical  event  would  be  the  ZZZ 
syn-ZZE  anti  isomerization.  Raman  resonance 
studies  have  also  shown  that  the  primary  photo¬ 
chemical  event  on  the  pathway  Pfr  — >  Pr  is  the 
double-bond  reisomerization  at  the  methine  bridge 
C-D,  i.e.,  the  reversal  of  the  photoreaction  of  Pr 
[18].  So,  the  first  intermediate  (Lumi-F)  on  the 
pathway  Pfr  ->  Pr  has  ZZZ  syn  isomerism.  An¬ 
other  RR  study  has  shown  that  the  proton  release 
on  the  pathway  Pr  ->  Pfr  occurs  between  the 
Lumi-R  intermediate  and  Meta-Ra  intermediate 
[35]. 

Based  on  such  experimental  evidences  and  on 
the  schematic  photochemical  cycle,  first  proposed 
by  Eilfeld  and  Rudiger  [36],  we  propose  a  possible 
mechanism  for  intermediate  conversion.  We,  then, 
proceed  to  model  this  mechanism. 

The  sequence  starts  with  the  protonated  phy¬ 
tochrome  in  its  Pr  form,  which  has  ZZZ  syn  con¬ 
figuration  [18].  Its  positive  charge  is  balanced  by 
either  an  aspartate  or  glutamate  ion  hydrogen 
bonded  to  the  N+-H  group  of  ring  B  and  N-H 
group  of  ring  C.  The  primary  photochemical  event 
is  the  absorption  of  red  light  ( ~  660  nm)  leading  to 
ZZZ  syn  ->  ZZE  anti  isomerization  (Fig.  5).  Steric 
hindrance  between  ring  D  and  the  negative  amino 
acid  residue  disrupts  the  hydrogen  bond  (HB) 
network  by  moving  the  amino  acid  farther.  To 
counterbalance  the  positive  charge,  the  carbonyl 
group  of  Gln324  approaches  and  restores  the  HB 
network,  thus  forming  the  first  intermediate 
(Lumi-R).  Proton  transfer  from  the  N+-H  group 
of  ring  B  to  the  carbonyl  oxygen  of  Gln324 
(N---H  +  0=0  generates  the  second  intermediate 
(Meta-Ra).  A  subsequent  proton  transfer  to  the 
negative  amino  acid  displaced  in  the  first  step 
generates  the  Meta-Rc  intermediate.  We  believe 
that  the  major  differences  between  Meta-Rc,  Meta- 
Rb,  and  Pfr  are  conformational  because  in  all  three 
intermediates  the  chromophore  is  deprotonated, 
with  Pfr  as  the  most  stable  of  the  three. 


The  other  leg  of  the  cycle  starts  with  the  Pfr 
form.  Under  far-red  light  (730  nm)  ZZE  anti 
ZZZ  syn  isomerization  occurs.  Steric  hindrance 
from  ring  D  is  released  and  the  neutral  amino  acid 
generated  in  the  forward  leg  can  reprotonate  the 
nitrogen  atom  of  ring  B.  The  main  differences 
between  the  intermediate  Lumi-F  and  Meta-F  are 
probably  the  conformation  of  the  tetrapyrrole  chro¬ 
mophore  and  the  position  of  the  neutral  amino 
acid.  In  the  case  of  Lumi-F  and  Meta-F,  the  amino 
acid  is  probably  neutral  and  hydrogen  bonded  to 
the  nitrogen  atom  of  ring  B,  at  least  in  the  Meta-F 
intermediate  ( — O — H---N).  As  the  temperature 
increases  to  230  K  (Fig.  2),  the  structure  relaxes 
and  complete  proton  transfer  occurs  ( — O  --- 
H —  +  N),  closing  the  cycle  (Fig.  5). 

MODELED  INTERMEDIATES 

Having  thus  rationalized  the  chemical  transfor¬ 
mations  which  occur  upon  photoisomerization,  in 
order  to  establish  the  feasibility  of  our  proposal, 
we  proceeded  to  calculate  the  pertinent  structures 
at  the  AMI  level.  To  simplify,  we  used  an  acetate 
as  a  probe  instead  of  an  aspartate  or  a  glutamate. 
We  started  with  our  model  of  the  Pr  conformation, 
and  after  full  optimization  of  the  most  stable  con¬ 
formation,  we  obtained  the  acetate  anion  hydrogen 
bonded  to  the  chromophore.  Next,  we  calculated  a 
structure  forcing  ZZZ  syn  — >  ZZE  anti  isomeriza¬ 
tion  and  took  the  acetate  group  away  from  the 
chromophore.  After  full  optimization,  we  obtained 
a  structure  in  which  the  carbonyl  oxygen  of  Gln324 
replaced  the  acetate  group  as  hydrogen  bond  ac¬ 
ceptor,  as  expected  for  the  Lumi-R  intermediate. 
We  transferred  the  proton  from  the  chromophore 
(protonated  nitrogen  of  ring  B)  to  the  carbonyl 
oxygen  of  Gln324  and  fully  optimized  this  inter¬ 
mediate  (Meta-Ra).  We  saw  then  that  there  were 
no  relevant  hydrogen  bonds  in  the  model.  The 
next  step  was  to  force  proton  transfer  from  Gln324 
to  the  acetate  probe.  The  fully  optimized  structure 
corresponds  well  to  the  expected  structure  for  the 
Meta-Rc  intermediate.  The  nitrogen  of  ring  C,  the 
acetic  acid  probe,  and  the  carbonyl  oxygen  of 
Gln324  are  hydrogen  bonded.  In  order  to  build  a 
putative  Meta-Rb  structure  and  restore  the  Pfr 
model,  conformational  changes  and  hydrogen 
bonds  on  the  Meta-Rc  model  were  necessary.  To 
construct  a  model  of  Lumi-F  from  Pfr,  we  had  to 
force  the  isomerization  ZZE  anti  ->  ZZZ  syn  and 
to  approach  the  protonated  acetic  acid  to  the  nitro¬ 
gen  of  ring  B.  After  full  optimization,  we  obtained 
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FIGURE  5.  Proposed  mechanism  for  the  phytochrome  intermediates  (P  =  CH2CH2COO  "). 

a  stable  structure  that  we  assigned  to  Lumi-F.  In  droxyl  group  of  the  acetic  acid  and  the  nitrogen  of 

this  intermediate  the  hydroxyl  oxygen  of  acetic  ring  B.  After  full  optimization,  the  carbonyl  oxy- 

acid  is  hydrogen  bonded  to  the  N-H  group  of  ring  gen  of  the  probe  was  hydrogen  bonded  to  the 

C  (N — H---0 — H).  To  build  the  Meta-F  model,  we  N-H  group  of  ring  C  (C=0— -H — N),  and  the 

had  to  force  a  hydrogen  bond  between  the  hy-  hydroxyl  group  of  the  probe  was  hydrogen  bonded 
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Pr  model 


R 


Luni-R  model 


Meta-Rc  model 


Meta-Rb  model 


Pfr  model 


N*— 0  =  2.7  A 
N  — 0  =  27  A 


N+ — 0  =  2  8  A 
N  — 0  =  2.7  A 


Oj  — 0|  =  2.8  A 
N  —  0|  =  3  1  A 
N  —  02  =  3.6  A 

Os-O,  =2.7  A 
N  — O,  =28A 


N*— 0=12.1° 

ft 

l4— 0=6.4° 


Hi 

N-— 0=95 

ft 

N — 0  =  14.4 


V . 

Oft-O,  =39  8° 

ft 

N--0,  =31.4° 

ft 

N--O2  =23.8° 


ft 

Of- O,  =8.6° 

ft 

N--O,  =  20  4° 


02— 0,=2.7A  02-"0,=8.8° 


>H 

N2  —  0  =  36  A  N2'--0  =  24.20 

H 

O  —  N,  =34  A  0-N,=33.5° 


,H 

O— N,=3  5A  O— N,  =  165° 

H 

N2 -0  =  3.2  A  N2'--- 0=30.8° 


Meta-F  model 

FIGURE  6.  Relevant  hydrogen  bonds  of  the  fully  optimized  Pr,  Pfr,  and  intermediate  models. 
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to  the  nitrogen  of  ring  B  (N---H — O).  Complete 
proton  transfer  from  Meta-F  restored  the  original 
Pr  model,  thus  closing  the  cycle.  Figure  6  shows 
the  relevant  hydrogen-bonding  network  of  the  in¬ 
termediates.  Figures  7-10  show  the  fully  opti¬ 
mized  AMI  3D  structures  of  Pr,  Pfr,  and  interme¬ 
diates. 

Table  I  shows  the  principal  dihedral  angles  of 
the  fully  optimized  phytochrome  and  intermedi¬ 


Lumi-R  model 


FIGURE  7.  Fully  AMI  optimized  3D  structures  of  Pr 
and  Lumi-R. 


ates  forms  and  also  of  the  C-Phycocyanin  protein 
X-ray  structure.  These  results  show  that  the  fully 
optimized  ZZZ  syn  models  (Pr,  Lumi-F,  and  Meta- 
F  model)  have  almost  the  same  C-Phycocyanin 
X-ray  structure  conformation.  This  again  suggests 
that  the  amino  acid  sequence  in  the  C-Phycocyanin 
protein  plays  the  same  role  as  the  equivalent  amino 
acid  sequence  in  the  phytochrome.  Moreover,  be¬ 
cause  of  the  syn-anti  conformational  change,  the 


Meta-Rc  model 


FIGURE  8.  Fully  AMI  optimized  3D  structures  of 
Meta-Ra  and  Meta-Rc. 
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Meta-Rb  mode] 


Lumi-F  model 


Pfr  model 

FIGURE  9.  Fully  AMI  optimized  3D  structures  of 
Meta-Rb  and  Pfr. 

only  difference  between  the  ZZZ  syn  and  ZZE  anti 
configuration  (Lumi-R,  Meta-Ra,  Meta-Rb,  Meta- 
Rc,  and  Pfr  model)  in  the  phytochrome  models 
besides  the  double-bond  dihedral  angle  (Z-E  iso¬ 
merization)  is  the  y  dihedral  angle. 

ELECTRONIC  SPECTRA 

To  further  support  our  proposal,  we  calculated 
the  electronic  spectra  of  the  above  phytochrome 


Meta-F  model 

FIGURE  10.  Fully  AMI  optimized  3D  structures  of 
Lumi-F  and  Meta-F. 

models  and  compared  them  with  the  experimental 
(UV-Vis)  spectra  of  Pr,  Lumi-R,  Meta-Ra,  Meta-Rb, 
Meta-Rc,  Pfr,  Lumi-F,  and  Meta-F  [11].  Here,  it  is 
important  to  calculate  the  position  of  the  absorp¬ 
tion  bands,  but  it  is  more  important  to  obtain  the 
oscillator  strength  ratio  of  the  visible  to  the  UV 
(Soret)  bands  and  the  oscillator  strength  ratio  of 
the  second  absorption  band  to  the  first  absorption 
band.  These  ratios  are  considered  indicators  of  the 
conformation  of  the  tetrapyrrole  rings  in  bilipro- 
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TABLE  I _ 

Conformation  of  proposed  models. 


Crystal  structure  (bacteria)  158.0  0.5  143.0 

Pr  model  129.2  -1.0  145.0 

Lumi-R  model  130.5  -1.7  -59.2 

Meta-Ra  model  1 32.7  - 1 0.8  -  25.3 

Meta-Rb  model  130.6  -3.0  -74.6 

Meta-Rc  model  139.9  -3.5  -66.5 

Pfr  model  141.2  -6.5  -60.9 

Lumi-F  model  140.1  3.1  143.4 

Meta-F  model  140.4  -5.2  134.7 


teins  and  in  the  free  chromophores.  When  the 
chromophore  conformation  is  extended,  the  /VIS/ 
/uv  ratio  is  over  3.0;  when  it  is  semicyclic  or 
semihelical,  the  /Vis//uv  ra^°  *s  between  0.8  and 
2.0;  when  it  is  cyclic  or  cyclohelical,  the  /Vis//uv 
ratio  is  ~  0.4;  and,  finally,  when  the  chromophore 
conformation  is  cyclic,  the  /Vis//uv  rah°  *s  ^  0.15 
[28]. 

Table  II  shows  the  calculated  absorption  spectra 
of  the  phytochrome  models  (Pr,  Pfr,  and  interme¬ 
diates).  Calculations  show  three  systems  of  peaks 
of  mainly  tt  character,  which  correspond  to  the 
experimental  spectrum  of  phytochrome  obtained 
by  Thiimmler  and  Rudiger  [11].  The  two  band 
systems  of  higher  energy  are  in  very  good  agree¬ 
ment  with  the  observed  spectra.  The  low  energy 
band,  however,  shows  only  poor  agreement.  This 
might  be  due  to  the  size  of  the  model  we  have 
chosen  since  only  five  amino  acid  residues  would 
hardly  account  for  all  field  effects  found  in  the 
chromoprotein.  There  is  experimental  evidence 
supporting  this  latter  hypothesis  [34,  37-39]. 

As  stated  above,  the  oscillator  strength  ratio  is 
considered  an  indicator  of  the  chromophore  con¬ 
formation  in  biliproteins  [28].  Here,  we  used  the 
experimental  oscillator  strength  ratio  of  the  second 
absorption  band  over  the  first  absorption  band 


TABLE  II _ 

Comparison  of  spectroscopic  data  (UV-Vis)  with  the  calculated  spectra  of  the  proposed  models. 


Form 

j/(1)  (cm  h 

v(2)  (cm  h 

i/(3)  (cm  1) 

Pra 

14,992 

26,315 

35,714 

Pr  model 

19,607 

28,653 

37,453 

Lumi-Ra 

14,430 

26,041 

n.a. 

Lumi-R  model 

20,408 

27,472 

36,101 

Meta-Raa 

15,082 

25,906 

n.a. 

Meta-Ra  model 

20,202 

28,011 

34,013 

Meta-Rba 

1 5,037 

26,315 

n.a. 

Meta-Rb  model 

21,459 

26,041 

35,087 

Meta-Rca 

13,793 

25,839 

n.a. 

Meta-Rc  model 

20,618 

25,252 

34,482 

Pfra 

13,495 

24,813 

35,714 

Pfr  model 

20,894 

26,659 

34,746 

Lumi-Fa 

14,858 

25,773 

n.a. 

Lumi-F  model 

21,141 

27,700 

38,610 

Meta-Fa 

15,151 

26,246 

n.a. 

Meta-F  model 

21 ,052 

26,954 

34,364 

a  Experimental  spectra  [11]: 

(1),  first  absorption  band;  (2),  second  absorption  band;  (3),  third  absorption  band. 
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TABLE  III 


Comparison  of  experimental  and  calculated  oscillator 
strength  ratio.3 


Form 

f2/f , 

^VIS^UV 

Pr 

1.10 

1.36 

Pr  model 

0.70 

1.25 

Lumi-R 

0.86  (0.73) 

n.a. 

Lumi-R  model 

0.83 

n.c. 

Meta-Ra 

1.37  (1.24) 

n.a. 

Meta-Ra  model 

1.01 

n.c. 

Meta-Rb 

2.29  (2.16) 

n.a. 

Meta-Rb  model 

2.93 

n.c. 

Meta-Rc 

1.45  (1.32) 

n.a. 

Meta-Rc  model 

1.73 

n.c. 

Pfr 

1.10  (0.97) 

0.88 

Pfr  model 

1.70 

0.72 

Lumi-F 

1.02  (0.89) 

n.a. 

Lumi-F  model 

0.78 

n.c. 

Meta-F 

1.15  (1.02) 

n.a. 

Meta-F  model 

0.72 

n.c. 

afVis^uv_V^2  +  The  values  in  parenthesis  excluded 
0.13,  the  contribution  of  the  equilibrium  Pr-Pfr. 


(/2//1)  [11]  aRd  *he  experimental  oscillator  strength 
ratio  of  the  visible  band  (first  absorption  band,  /Q 
over  the  UV  band  (second  absorption  band,  /2, 
plus  third  absorption  band,  /3)  (/Vis//uv)  [28]. 
Saturation  with  660-nm  light  shifts  the  equilibrium 
to  the  88%  Pfr  form  [1].  As  a  result,  we  discounted 
this  contribution  from  the  intermediates  f2/f\  ra_ 
tio  and  from  the  Pfr  f2/f\  ratio.  The  /Vis//uv  ra^° 
of  Pfr  was  obtained  excluding  the  Pr  contribution 
[28].  Therefore,  we  concluded  that  the  oscillator 
strength  ratios  calculated  for  the  phytochrome 
models  (Pr,  Pfr,  and  intermediates)  are  in  good 
agreement  with  the  experimental  results  (Table 
III). 


Concluding  Remarks 

Except  for  the  lower  energy  band,  the  calculated 
spectra  reproduce  well  the  spectra  of  the  phy¬ 
tochrome  (Pr,  Pfr,  and  intermediates).  This  result 
is  attributed  to  the  small  number  of  amino  acids 
included  in  the  models  (five  amino  acids  + 
chromophore). 

The  calculated  ratios  (/Vis//uv  and  /2//1)  f°r 
the  models  match  very  well  the  experimental  ra¬ 
tios  obtained  for  the  phytochrome  (Pr,  Pfr,  and 
intermediates).  This  supports  the  proposed  mecha¬ 
nism  for  the  photoisomerization  process. 
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ABSTRACT:  We  follow  the  initial  activation  of  the  nitrogen  molecule  at  the  FeMo 
cofactor  of  nitrogenase  and  subsequently  model  the  hydrogenation  of  N2  up  to  the  fourth 
protonation  step  using  the  intermediate  neglect  of  differential  overlap  quantum-chemical 
model.  The  results  obtained  favor  a  reaction  mechanism  going  through  hydrazido 
intermediates  on  the  4-Fe  surfaces,  externally  to  the  FeMo  cofactor.  Calculations  using 
density  functional  theory  on  smaller  model  systems  also  support  the  suggested 
mechanism  over  other  possible  schemes  that  involve  early  release  of  the  first  molecule  of 
ammonia  as  a  product  of  the  enzymatic  reaction.  We  also  demonstrate  that  dielectric 
stabilization  due  to  the  protein  around  the  cofactor  could  lower  markedly  the  barrier  for 
the  product  release  as  an  ammonium  ion.  ©  1998  John  Wiley  &  Sons,  Inc.  Int  J  Quant  Chem 
70:  1159-1168,  1998 

Key  words:  nitrogenase;  nitrogen  fixation;  INDO;  DFT,  PM3tm 


INTRODUCTION 

The  mechanisms  of  biological  nitrogen  fixation 
(nif)  have  been  a  challenge  for  scientists  for 
over  a  hundred  years,  following  the  discovery  of 
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the  bacterial  process  by  Hellriegel  and  Willfart  in 
1886  [1].  Much  of  the  interest  in  this  area  stems 
from  the  important  role  that  nitrogen  compounds 
play  in  biology  and  commerce  and  that  no  indus¬ 
trial  process  exists  that  competes  well  with  nature. 
The  interest  in  this  area  has  increased  dramatically 
in  the  last  several  years  after  the  solution  of  the 
X-ray  structure  of  the  enzyme  nitrogenase  (N2ase) 
at  near-atomic  resolution  for  different  nif-type  bac¬ 
teria  [2—4].  Various  types  of  nitrogenases  (FeMo, 
FeV,  and  Fe  only)  have  been  reported  exhibiting 
similar  chemistry  [3].  Among  other  similarities,  the 
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presence  of  a  six-atom  prism  built  of  coordination- 
ally  unsaturated  Fe  atoms  at  the  center  of  the 
metal  cofactor  has  been  the  primary  target  of  theo¬ 
retical  speculation  on  the  possible  mechanism  that 
involves  the  interaction  of  the  substrate(s) — N2 
and  a  number  of  other  small  molecules — with  the 
metal  cluster  [2-4].  Several  theoretical  models  were 
examined  and  reviewed  critically  in  the  literature 
in  the  last  few  years  [5-18], 

In  previous  works  [8,  9]  we  have  proposed  and 
examined  a  theoretical  model  for  the  active  site  of 
Azotobacter  vinelandii  FeMo  cofactor  based  on  the 
experimental  structure  by  Rees  et  ah  [2],  spectro¬ 
scopic  evidence  for  the  ground  electronic  state  of 
the  cluster  and  the  intermediate  neglect  of  the 
differential  overlap  (INDO)  as  a  theoretical  method 
[8].  We  have  shown  that  the  initial  activation  of 
the  N2  molecule  inside  the  cofactor  is  favored  [8] 
and  the  access  of  the  substrate  to  the  cluster  inte¬ 
rior  can  be  further  facilitated  by  the  electron  addi¬ 
tion  to  the  cofactor  prior  or  during  the  enzymatic 
reaction  [9,  10].  We  have  compared  two  possible 
models  for  the  active  site  having  39  and  41  open- 
shell  electrons  and  have  shown  that  these  two 
models  produce  very  similar  results  [8,  9].  The 
spin  distributions  within  the  native,  reduced,  and 
oxidized  forms  of  the  FeMo  cofactor  were  also 
examined  using  the  projected  unrestricted  Har- 
tree-Fock  (PUHF)  methodology  [9]  implemented 
recently  in  the  ZINDO  program  [11]  and  have 
reproduced  to  a  good  extent  the  available  experi¬ 
mental  data,  pointing  to  a  significant  delocaliza¬ 
tion  of  the  electron  density  over  the  metal  open 
shells  (d-orbitals),  and  the  existence  of  two  distinct 
groups  of  unpaired  spin  on  the  Fe  atoms  coupled 
anti-ferromagnetically  through  the  three  fi- sulfur 
bridges  [9], 

In  this  work  we  examine  further  the  above 
theoretical  model  [8]  in  attempts  to  provide  infor¬ 
mation  on  the  reaction  intermediates  that  can  pos¬ 
sibly  form  during  the  hydrogenation  of  the  acti¬ 
vated  N2  molecule.  As  in  our  previous  work  we 
use  the  INDO  model  in  the  restricted  open-shell 
Hartree-Fock  (ROHF)  approximation  and  the 
model  structure  discussed  in  detail  earlier  [8]. 
Density  functional  theory  (DFT),  employing  the 
Becke-Lee-Yang-Parr  (BLYP)  functional  [12]  and 
a  double-zeta  basis  set  (DZVP)  with  polarization 
including  the  Turbomole  program  [13],  supple¬ 
ment  parts  of  this  work,  especially  in  the  cases 
where  a  preference  is  given  to  a  particular  inter¬ 
mediate  over  other  possible  structures. 


Results  and  Discussion 

Several  nif  reaction  schemes  have  been  pro¬ 
posed  and  discussed  in  the  literature  [4].  Among 
them,  the  reaction  model  by  Thorneley  and  Lowe 
[14]  has  been  well  established  based  on  kinetic 
studies  of  the  reaction.  This  model  suggests  the 
early  release  of  the  first  ammonia  molecule  as  a 
most  probable  event  in  the  course  of  the  nif  reac¬ 
tion.  Other  schemes  have  also  been  studied  by 
Coucouvanis  and  co-workers  [15]  on  the  basis  of 
model  Mo /Fe /S  compounds  showing  a  particular 
reactivity  with  hydrazine  and  leading  to  its  reduc¬ 
tion  to  ammonia  in  which  the  heteroatom  (Mo  or 
V)  plays  an  important  role.  As  N2H4  is  on  the 
nitrogen  fixation  reaction  path  [16],  it  could  be  that 
the  biological  process  may  turn  to  the  formation  of 
hydrazine  (fully  or  only  partly)  during  the  reac¬ 
tion.  Studies  on  model  nif-Fe/S  complexes  done 
by  Sellmann  and  Sutter  [17]  also  point  strongly  to 
such  a  possibility.  In  addition,  chemical  quenching 
of  the  bacterial  nif  reaction  does,  indeed,  show  the 
presence  of  hydrazine  [14];  however,  the  observed 
N2H4  can  also  be  formed  from  N — NH2  or  N=NH 
intermediates  when  a  strong  acid  or  base  is  used 
to  halt  the  nif  reaction  and  intercept  the  intermedi¬ 
ates.  Other  types  of  nitrogenase,  such  as  the  FeV 
type  with  a  similar  structure  [3],  also  produce 
N2H4  as  a  regular  product  of  the  bacterial  nif 
reaction. 

In  any  case,  the  reaction  mechanism  is  presently 
unclear. 

Figure  1  shows  the  catalytic  cycle  of  the  N2 
fixation  process  the  way  we  model  it  through 
single  H  steps.  It  starts  and  ends  with  the  bare 
model  cofactor  and  involves,  as  a  first  step,  the 
incorporation  of  the  N2  molecule  in  the  cluster 
interior  where  it  forms  multiple  Fe — N  bonds  with 
the  six  irons  of  the  central  prism,  as  shown  in 
Figure  2.  The  number  of  metal-nitrogen  bonds, 
however,  may  vary  given  the  substrate  mobility 
and  the  dynamics  of  the  system  due  to  vibrations 
or  electron  transfer  to  the  active  site  which  we 
studied  in  some  detail  [8,  9].  Spatial  limitations, 
however,  do  not  allow  the  cluster  at  its  present 
structure  to  accommodate  any  of  the  intermedi¬ 
ates,  and  we  find  further  stages  of  the  hydrogena¬ 
tion  process  to  occur  outside  the  cofactor,  predom¬ 
inantly  on  the  4-Fe  face  described  by  Dance  [6]. 
Besides  the  largest  contact  area  provided  by  this 
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FIGURE  1.  A  schematic  for  the  reaction  N2  +  8H++  8e  ->  2NH3  +  H2  catalyzed  by  the  enzyme  nitrogenase.  In  the 
present  modeling,  the  evolution  of  H2  has  not  been  examined.  It  is  tempting  to  speculate,  however,  that  these  added 
protons  that  eventually  evolve  as  H2  are  used  to  free  the  Mo  binding  site  for  hydrazine  migration. 
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4-Fe  site  (compared  to  the  other  two  4-Fe  faces),  it 
is  also  less  crowded  by  the  protein  environment 
(nearest  residue  Arg-359)  as  seen  from  the  experi¬ 
mental  structure  [2]  and  discussed  widely  in  the 
literature  [4,  5].  The  protein  and  the  cluster  dy¬ 
namics  may,  however,  modify  the  picture  signifi¬ 
cantly  and  allow  for  intermediates  located  on  the 
two  other  alternative  4-Fe  places  that  are  available 
in  the  cluster. 

Figure  3  shows  the  minimized  positioning  of 
the  first  N — NH  intermediate  on  the  FeMo  cofac¬ 
tor  surface.  The  structure  has  been  obtained  after 
the  addition  of  an  electron  to  the  initially  activated 
cofactor-N2  system  and  subsequent  protonation  of 
the  N2  molecule.  Although  we  find  occasionally  a 
small  difference  between  the  structures  obtained 
after  the  addition  of  an  H  atom  to  the  model 
system  and  that  obtained  via  three-step  hydro¬ 
genation  (electron  addition,  relaxation,  and  proto¬ 
nation),  we  prefer  the  latter  type  of  modeling,  as  it 
seems  more  natural  and  is  supported  by  experi¬ 
mental  findings  implying  similar  sequence  of 


events  in  Fe/S  systems  [18].  As  in  our  previous 
work  [8],  we  allowed  for  a  complete  freedom  of 
movement  of  the  substrate  around  the  cofactor 
and  we  used  the  configuration-averaged 
Hartree-Fock  methodology  in  its  ROFiF  form  to 
obtain  the  pure  spin  states  [19].  The  geometry 
optimizations  and  the  relative  energies  that  follow 
from  them  were  calculated  using  the  default  set  of 
geometric  parameters  in  ZINDO,  while  the  re¬ 
ported  electronic  properties  are  obtained  using  the 
spectroscopic  model  (INDO/S)  on  the  relaxed 
structures.  We  have  applied  this  approach  success¬ 
fully  in  a  number  of  studies  [20]. 

The  calculated  geometric  and  electronic  proper¬ 
ties  (distances,  atomic  charges,  bond  indices)  for 
the  intermediates  that  we  calculate  in  this  work 
are  summarized  in  Table  I.  As  seen  in  this  table, 
the  charge  on  the  nitrogen  atoms  remains  negative 
and  declines  during  the  hydrogenation,  due  to  the 
charge  transfer  from  the  metal  tf-orbitals  into  the 
77*  orbitals  on  the  N2  system.  The  hydrogenation 
has  a  relatively  small  effect  on  the  N — N  bond, 


FIGURE  3.  The  most  favored  position  of  the  N2H1  intermediate  on  the  4-Fe  face  of  the  obtained  after  energy 
minimization  of  the  substrate. 


TABLE  I _ _ _ 

Calculated  N-N  bond  lengths  (A),  Mulliken  charges  on  the  two  nitrogen  atoms,  g(N),  and  the  N~N  Wiberg 
bond  index  (b.i.)  between  the  N  atoms  for  the  reaction  intermediates  obtained  after  a  geometry  optimization. 


System 

Ann 

g(  n,) 

<7(N2) 

N— N  b.i. 

FeMo  +  N2H0 

1.263 

-0.486 

-0.385 

0.999 

FeMo  +  N2H1 

1.367 

-0.544 

-0.216 

0.947 

FeMo  +  N2H2 

1.364 

-0.407 

-0.176 

0.951 

FeMo  +  N2H3 

1.371 

-0.281 

-0.149 

0.969 

FeMo-f  NpH4 

1.370 

-0.298 

0.048 

0.952 
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however,  as  judged  from  the  equilibrium  N — N 
distances  and  bond  indices  shown  in  Table  L  This 
is  not  unexpected  as  the  final  intermediate  exam¬ 
ined  in  this  work,  hydrazine,  is  a  stable  compound 
with  a  relatively  long  N — N  bond  (1.47  A).  The 
intermediate  obtained  on  the  surface  of  the  cofac¬ 
tor  has  a  shorter  N— N  bond  (1.37  A),  most  proba¬ 
bly  due  to  specific  bonding  to  the  metal  surface. 

As  further  seen  from  Figures  3-6,  the 
intermediates  tend  to  form  stable  configurations 
on  the  same  4-Fe  face  of  the  FeMo  cofactor,  and, 
most  importantly,  symmetrically  hydrogenate  the 
N — N  molecule  over  other  possibilities  that  in¬ 
volve  sequential  attachment  of  the  H  atoms  to 
preferentially  one  N  atom.  This  is  a  very  important 
observation  which,  if  proven  true,  may  speak  in 
favor  of  a  dominant  path  for  the  nif  reaction  going 
through  hydrazido  intermediates  rather  than  re¬ 
leasing  the  first  ammonia  at  the  third  protonation 
step.  This  problem  has  been  given  some  attention 
in  the  literature  using  model  systems  [22],  and 
there  are  indications  that  the  addition  of  three 
protons  may  lead  to  the  formation  of  NH-NH2 
intermediates  rather  than  N-NH3.  Our  calcula¬ 
tions,  summarized  in  Figure  7,  support  this  idea, 
showing  increasingly  higher  energy  differences  be¬ 
tween  the  N2Hx  isomers  with  the  increase  of  x. 
Large  barriers  are  expected  to  exist  for  the  release 
of  the  first  ammonia  at  the  third  and  even  fourth 
protonation  steps,  typically  over  200  kcal/mol.  Al¬ 
though  seemingly  high  (ZINDO  is  known  to 
overbind,  often  by  a  factor  of  2-3)  the  calculated 
energy  differences  clearly  indicate  that  hydrazine 
should  be  formed,  provided  the  cluster  and  the 


protein  relaxation  effects  are  assumed  to  be  small. 
The  latter  effects  are  difficult  to  estimate  because 
the  size  of  the  system  makes  optimizing  all  struc¬ 
tures  completely  a  task  almost  impossible,  even 
for  a  semiempirical  method  such  as  ZINDO.  We 
have  tried,  however,  at  least  to  verify  the  energy 
differences  between  the  N2H2  isomers  using  more 
reliable  theories  on  smaller  model  systems.  Table 
II,  for  example,  shows  the  contribution  to  the  total 
energy  difference  between  the  two  N2H2  isomers, 
which  comes  from  the  intermediate  alone.  The 
calculations,  done  at  various  levels  of  theory,  show 
that  the  NH-NH  has  a  considerably  lower  energy 
ground  state  as  that  of  the  the  N-H2  analog  and 
that  the  metal  system  in  fact  contributes  to  de¬ 
creasing  this  difference  to  approximately  -  20 
kcal/mol;  see  Figure  7.  The  same  observation  holds 
obviously  for  the  N2H3  intermediates  where  the 
energy  difference  is  even  larger,  -90  kcal/mol,  in 
favor  of  the  NH-NH  2  form  (Fig.  7).  The  consis¬ 
tency  of  the  results  we  obtained  using  the  much 
faster  ZINDO  model  and  the  more  accurate  DFT- 
ACM  (B3LYP)  and  ab  initio  MP2  models  (Table  II) 
gives  us  some  confidence  in  proceding  with  the 
former. 

We  have  tried  to  further  verify  at  least  the  trend 
for  the  above-mentioned  energetics  using  DFT 
similar  to  that  explored  by  others  [23],  the  BLYP 
functional  and  the  DZVP  basis  set.  We  used  a 
bimetallic  Fe  cluster  (Fig.  8)  to  model  part  of  the 
3-coordinated  Fe  site  and  the  preferential  attach¬ 
ment  of  the  N2Hx  intermediates  to  it.  The  calcula¬ 
tions  have  been  done  using  the  Turbomole  DFT 
program  [13].  Full  geometry  optimizations  were 
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FIGURE  4.  The  most  favored  position  of  the  N2H2  intermediate  on  the  4-Fe  face  of  the  obtained  after  energy 
minimization  of  the  substrate. 
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FIGURE  5.  The  most  favored  position  of  the  N2H3  intermediate  on  the  4-Fe  face  of  the  obtained  after  energy 
minimization  of  the  substrate. 


done  to  include  the  effects  of  the  Fe-S-Fe  relax¬ 
ation.  The  results  obtained  confirm  the  ZINDO 
observations  qualitatively,  though  the  energy  dif¬ 
ferences  are  much  smaller  than  those  obtained 
from  the  semiempirical  calculation:  —6.1  kcal/mol 
for  the  N2H2  isomers  and  -4.6  kcal/mol  in  the 
case  of  N2H3  intermediates  from  the  DFT  calcula¬ 
tions,  compared  to  -39.7  and  —56.2  kcal/mol, 
respectively,  for  the  ZINDO  calculations.  Poor  DFT 
energy  convergence  in  the  case  of  N2H4  interme¬ 
diates  did  not  allow  an  estimation  of  the  differ¬ 
ences  between  them.  The  ZINDO  results  give  a 
preference  for  the  NH2-NH2  over  NH-NH3  by 


-93  kcal/mol  on  the  2-Fe  model  system.  Compar¬ 
ing  the  DFT  and  the  ZINDO  results  on  this 
metal-substrate  model,  we  observe  a  significant 
difference,  7  to  9  times  for  the  first  two  couples  of 
isomers  above,  between  the  calculated  energy  dif¬ 
ferences  (in  absolute  value)  obtained  using  ZINDO 
for  the  same  DFT  optimized  structures  based  on 
the  2-Fe  model  calculations.  It  is  difficult  to  say 
whether  these  differences,  in  the  scale  of  the  ener¬ 
getics,  stem  from  overbinding  in  INDO,  or  under¬ 
binding  in  DFT,  or  both.  Perhaps  more  elaborate 
approaches  can  reveal  the  origin  of  this  effect,  but 
as  far  as  the  trend  is  concerned,  both  methods 


FIGURE  6.  The  most  favored  position  of  the  N2H4  (hydrazine)  intermediate  on  the  4-Fe  face  of  the  obtained  after 
energy  minimization  of  the  substrate. 
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FIGURE  7.  Calculated  energy  differences  between  the  substrate  isomers  as  intermediates  and  the  favored  reaction 
path  (in  bold)  suggesting  the  nif  reaction  going  through  hydrazido  intermediates. 


TABLE  II _ 

Calculated  energy  differences,  AEHNNH_NNH2  (kcal/mol),  between  the  HN— NH  and  N — NH2  fragments  using 
various  methodologies.3 


ZINDO 

MOPAC 

DFT 

ACM 

HF 

MP2 

-39.7 

“33.2 

-36.2 

-35.5 

-23.3 

-48.0 

aACM  stands  for  the  adiabatic  connection  model  (here  B3LYP),  HF  are  all-electron  ab  initio  calculations.  The  nonempirical 
calculations  utilize  DZVP  basis  set.  MOPAC  calculations  were  done  using  the  AMI  Hamiltonian.  The  BLYP  functional  is  used  in  the 
DFT  calculations. 
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FIGURE  8.  The  model  3-coordinated  bimetallic  Fe  site 
used  to  compare  the  DFT  and  INDO  calculations. 


predict  more  stable  hydrazido  over  nitrido  inter¬ 
mediates,  which  is  the  main  target  of  the  present 
study. 

Even  if  we  assume  that  hydrazine  is  formed  as 
a  reaction  intermediate,  several  difficult  questions 
remain  unanswered  at  this  stage  of  our  study: 
Does  the  substrate  stay  and  get  further  reduced  to 
ammonia  at  this  particular  4-Fe  site?  How  mobile 
is  the  N2H  y  system  and  can  it  leave  and  reattach 
to  a  different  metal  site?  How,  indeed,  can  one 
break  even  a  single  N — N  bond  under  these  condi¬ 
tions  given  the  relative  stability  of  the  calculated 
N — N  bonding  parameters  with  respect  to  hydro¬ 
genation?  Are  charged  species,  such  as  NHJ ,  in¬ 
volved  in  the  dissociation  and  to  what  extent  can 
they  contribute  to  it?  What  is  the  role  of  the 
protein  environment  in  the  dissociation  process? 


We  would  like  to  speculate  on  this  process. 

Table  III  gives  the  sum  of  the  calculated  bond 
indices  for  the  four  intermediates  and  the  FeMo 
cofactor.  It  includes  also  the  hydrogen  bonding 
between  the  H  atoms  and  the  sulfurs  present  at 
the  4-Fe  face  where  the  intermediate  stabilization 
occurs.  We  observe  that  the  binding  between  the 
substrate  and  the  FeMo  cofactor  decreases  sharply 
with  the  hydrogenation  process;  this  could  lead  to 
desorption  of  the  hydrazido  intermediates  from 
the  surface  of  the  cofactor,  as  recently  suggested 
by  Coucouvanis  and  co-workers  [15]  examining 
the  reduction  of  hydrazine  on  similar  (mono- 
cubane)  Mo/Fe/S  compounds.  This  idea  is  illus¬ 
trated  in  Figure  9  and  finds  support  through  the 


TABLE  III 


Sums  of  the  calculated  Wiberg  bond  indices  between 
the  N2H2  intermediates  and  the  FeMo  cofactor.8 


System 

2N — FeMo 

XH...S 

Total 

FeMo  +  N2H0 

4.471 

0.000 

4.471 

FeMo  + 

4.018 

0.012 

4.030 

FeMo  +  N2H2 

3.207 

0.033 

3.240 

FeMo  +  N2H3 

1.203 

0.109 

1.312 

FeMo  +  N2H4 

0.655 

0.134 

0.789 

aThe  hydrogen  bonding  indices  between  the  H  atoms  of  the 
intermediates  and  the  S-atoms  of  the  cofactor  are  also 
added  to  form  the  total  bond  index.  The  later  is  used  as  a 
measure  the  holding  force  between  the  FeMo  cluster  and 
the  reaction  intermediates,  see  also  text. 


H+ 


FIGURE  9.  Speculated  migration  of  the  hydrazine  substrate,  alone  or  upon  protonation,  suggested  by  Coucouvanis 
and  co-authors;  see  Ref.  [15]. 
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decrease  of  the  Fe — N  bond  orders  we  observe 
and  the  relatively  small  H-bonding  effects  that 
contribute  to  it.  To  date,  we  are  unable  to  find  a 
more  favorable  site  for  the  hydrazine,  either  at  the 
Mo  atom  or  elsewhere.  As  shown  in  a  previous 
study  [9],  the  Mo  atom  could  open  its  environment 
with  the  addition  of  electrons  to  the  FeMo  cofactor. 
If  this  happens,  structural  rearrangement  of  the 
Mo  coordination  could  be  favored,  either  through 
a  temporary  detachment  of  the  His-442  end  or  the 
carboxyl  O  atom  bonded  to  the  Mo,  both  prone  to 
bond  weakening  upon  reduction  [9].  It  is  also 
possible  that  the  two  carboxyl  groups  present  in 
the  homocitrate  group  are  protonated,  detaching 
from  Mo  coordination,  and  thus  play  a  central  role 
in  the  reduction  [15].  Further  addition  of  electrons 
could  then  evolve  hydrogen  and  regenerate  the 
original  Mo  environment.  In  addition  to  these  con¬ 
siderations,  environmental  effects  may  play  a  cru¬ 
cial  role.  The  latter  can  be  modeled,  to  a  certain 
extent,  using  a  reaction  field  as  a  substitute  for  the 
rest  of  the  protein  system.  We  have  seen  in  a 
previous  study  [8]  that  the  reaction  field  has  no 
significant  effect  on  the  reaction  profile  of  the  N2 
attachment  to  to  FeMo  cofactor,  and  this  is  easy  to 
believe  given  the  fact  that  the  reaction  takes  place 
mostly  in  the  interior  of  the  cofactor  and  there  are 
no  changes  in  net  charge  species  involved  in  it. 
The  formation  and  release  of  ammonia,  however, 
especially  externally  to  the  cofactor,  could  be 
greatly  affected  by  the  protein  surroundings.  Also, 
N — N  bond  breakage  either  may  result  in  neutral 
species,  with  ammonia  as  a  product  of  the  reac¬ 
tion,  or  NH4  might  be  released  instead.  Small 
charged  species  are  especially  stabilized  in  the 
protein  (dielectric)  environments.  The  flow  of  pro¬ 
tons  to  the  FeMo  cofactor  can  greatly  contribute  to 
the  latter  release.  The  net  media  effect  of  the  reac¬ 
tion  leading  to  the  formation  of  ammonium  ion 
can  be  estimated  (Scheme  1)  from  our  reaction 
field  calculations  to  be  as  much  as  36  kcal/mol, 
quite  enough  to  promote  the  breakage  of  the  al¬ 
ready  weakened  . . .  N — N . . .  bond.  The  last  dif¬ 
ference  originates  from  the  magnitude  of  the  two 
leading  electrostatic  terms  in  the  reaction  field 
expressions,  the  charge,  or  Born  term,  and  the 
dipole  moment,  or  Onsager  term  [21].  The  dielec¬ 
tric  constant  e  (set  equal  to  4  in  this  case,  closer  to 
similar  estimates  for  proteins  [24])  has  a  much 
smaller  effect  on  the  energetics  of  neutral  species, 
as  can  be  seen  from  the  two  expressions  in  Scheme 
1.  We  thus  point  out  that  dielectric  stabilization 


(m-NjHj*) - -  ( M-NH  )  +  (NH*) 


-(l-l/e).q2/2a0  agso1v  =  -40.3  kcal/mol 


M-NjHg  - -  M-NftJ  +  (NH3) 


-{(e-l)/(£+0.5)}.|a,2/2ajj  agso|v=  -4.1  kcal/mol 

SCHEME  1.  The  effect  of  dielectric  relaxation  on  the 
reaction  energetics  with  charged  and  neutral  species  as 
reactants  and  products.  A  spherical  cavity  self-consistent 
reaction  field  model  [21]  was  used  to  model  the  solvent 
(e  =  4  for  the  protein).  The  formulas  indicate  the  leading 
term  in  the  electrostatics;  q  is  the  Coulomb  charge  and 
fi  is  the  dipole  moment. 


due  to  the  presence  of  the  protein  surrounding  the 
cofactor  may  contribute  measurably  to  the  N — N 
cleavage  needed  for  the  release  of  ammonia,  in 
this  case  as  an  ion.  We  are  presently  examining  in 
much  greater  detail  the  final  stages  of  the  process. 


Conclusions 

On  the  basis  of  the  present  theoretical  model, 
we  suggest  that  the  biological  nitrogen  fixation, 
believed  to  take  place  at  the  M  cluster  of  nitro- 
genase,  should  preferentially  lead  to  the  forma¬ 
tion  on  hydrazido  intermediates  along  the  NN  -» 
NNH  ->  NHNH  ->  NHNH2  ->  NH2NH2  path. 
We  further  speculate  upon  the  release  of  NH^. 
The  formation  of  the  ammonium  ion  is  suggested 
as  an  aid  to  the  final  N-N  cleavage  due  to  dielec¬ 
tric  stabilization  of  the  small  ion  in  the  protein 
environment. 
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ABSTRACT:  RHF/6-31G*  investigations  of  4-,  5-,  and  6-ethyl(Et)-indole-3-acetic  acid 
(IAA)  yielded  11  symmetry-unique  local  minima  with  syn-pe riplanar  orientation  of  the 
— COOH  group  for  each  of  these  compounds.  The  global  minima  are  of  C1  symmetry  in 
all  cases.  Comparison  with  earlier  results  shows  that  ethylation  or  chlorination  in  position 
5  or  6  introduces  only  minor  changes  on  the  orientation  of  the  acetic  acid  side  group, 
with  no  effect  on  the  reaction  paths  related  to  this  group.  For  4-Et-IAA,  the  deviations 
from  unsubstituted  IAA  are  larger  but  preserve  the  pattern  of  reaction  paths  that  is 
present  in  unsubstituted  IAA,  which  is  in  contrast  to  4-C1-IAA,  where  local  minima  and 
reaction  paths  are  completely  different.  ©  1998  John  Wiley  &  Sons,  Inc.  Int  J  Quant  Chem  70: 
1169-1175,  1998 


Introduction 

Auxin  plant  hormones  govern  many  biological 
processes  in  higher  plants  such  as  cell  divi¬ 
sions  and  enlargement,  developmental  differentia¬ 
tion,  and  the  syntheses  of  specific  proteins.  Among 
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*  Permanent  address :  Instut  Ruder  Boskovic,  HR-10000  Za¬ 
greb,  Croatia. 


this  class  of  compounds,  we  are  specifically  inter¬ 
ested  in  indole  auxins,  the  parent  compound  of 
which  is  indole-3-acetic  acid  (IAA).  IAA  and  its  4- 
and  6-chlorinated  derivatives  are  naturally  occur¬ 
ring  auxins  [1-5].  In  addition,  a  large  number  of 
indole  auxins  have  been  synthesized  and  tested  on 
various  plant  cultures  [1-16].  Several  auxin-bind¬ 
ing  proteins  (ABP)  have  been  distinguished,  and, 
among  them,  ABP1  is  considered  to  be  the  main 
candidate  for  an  auxin  receptor  [17-26].  The  effec¬ 
tiveness  of  auxins  as  growth  promoters  depends 
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not  only  on  their  binding  affinity,  but  also  on 
several  other  factors,  for  example,  lipophilicity  or 
correlation  with  other  compounds  in  plants,  like 
cytokinines,  another  type  of  plant  hormone  [27- 
29].  However,  Reseller  et  al.  [16]  determined  the 
correlation  between  the  binding  affinity  and  the 
maximum  growth  rate  of  meize  coleoptil  section 
at  the  optimum  concentration  of  10“6  mol/L  for 
the  following  compounds:  naphthalene-l-acetic 
acid  (NAA1)  >  4-C1-IAA  >  4-methyl  (Me)-IAA  > 
IAA  >  4-ethyl(Et)-IAA  >  2-Me-IAA.  Regarding 
this,  and  the  other  available  biological  tests  per¬ 
formed  on  IAA  derivatives,  it  seems  that  the  bind¬ 
ing  affinity  is  very  sensitive  to  the  type  and  size  of 
the  indole  ring  substituent  in  position  4. 

Ab  initio  RHF  structure  investigations  have  been 
performed  for  IAA  [30]  and  several  mono-  and 
dichlorinated  derivatives  [31,  32].  These  studies 
yielded  an  interesting  result,  namely,  that  a  chloro 
substituent  at  position  4  changes  the  potential  en¬ 
ergy  surface  (PES)  completely,  whereas  chlorina¬ 
tion  at  positions  5,  6,  and  7  has  only  a  marginal 
effect  upon  reaction  paths  and  potential  barriers. 
Although  the  properties  of  isolated  molecules  can 
be  compared  only  to  a  limited  extent  with  experi¬ 
mental  binding  data,  these  RHF  results  indicate 
possible  binding  conformations  of  indole  auxins. 
The  knowledge  about  the  complex  influence  of 
weak  nonbonded  intramolecular  interactions  on 
the  PES  of  these  compounds  makes  us  aware  of 
similar  influences  of  intermolecular  interactions  on 
the  ligand  conformation  upon  binding.  The  present 
study  is  an  extension  of  these  earlier  RHF  investi¬ 
gations,  scrutinizing  the  influence  of  an  ethyl  sub¬ 
stituent  at  positions  4,  5,  and  6. 


Computational  Details  and  Results 

Local  minima  and  transition  states  were  deter¬ 
mined  via  RHF  optimizations.  Only  con  formers 
with  the  — COOH  group  in  si/n-periplanar  orien¬ 
tation  (i.e.,  values  for  the  torsion  angle  H — O — C- 
=0  ~  0°)  were  considered,  since  the  correspond¬ 
ing  tffth-periplanar  conformers  (H — O — C=0  ~ 
180°)  were  30-40  kj/mol  less  stable  in  previous 
studies.  The  standard  6-31G*  basis  set  was  em¬ 
ployed,  which  was  proven  to  be  adequate  in  the 
case  of  IAA  [30].  The  calculations  were  performed 
with  the  program  GAMESS  [33]  on  a  variety  of 
machines.  All  structures  were  fully  optimized  to  a 
remaining  root  mean-square  (rms)  gradient  less 


than  0.33  X  10-4  Hartree/bohr;  the  nature  of  all 
stationary  points  was  verified  via  computation  of 
the  eigenvalues  of  the  Hessian  matrix:  Local  min¬ 
ima  had  no  negative  eigenvalues  and  transition 
states  had  exactly  one  negative  eigenvalue. 

The  position  of  the  carboxyl  group  relative  to 
the  indole  ring  depends  on  two  torsion  angles 
called  T1  and  T2  in  the  following.  Using  the  atom 
numbering  shown  in  Figure  1,  T1  is  the  torsion 
angle  C2 — C3 — C8 — C9  and  T2  is  the  torsion  an¬ 
gle  C3 — C8 — C9=02.  The  orientation  of  the  ethyl 
group  is  described  by  T3,  which  is  the  torsion 
angle  Cll — CIO— Cn — Cn  +  1  for  w-Et-IAA.  The 
values  of  Tl,  T2,  and  T3  as  well  as  the  energy  of  all 
symmetry-unique  local  minima  (i.e.,  those  with 
Tl  >  0°)  are  collected  in  Tables  I— III. 


Discussion 

The  energies  of  the  various  local  minima  of  5- 
and  6-Et-IAA  follow  a  rather  simple  pattern:  Those 
conformers,  in  which  the  ethyl  group  is  in-plane 
with  the  indole  ring,  are  approximately  5  kj/mol 
higher  in  energy  than  those  with  a  tilted  ethyl 
group.  This  energy  difference  can  clearly  be  re¬ 
lated  to  repulsive  H***H  interactions.  For  an  in¬ 
plane  orientation  of  the  ethyl  group,  there  are  two 
such  interactions  (Cll — H---H — C4  in  5-Et-IAA, 
Cll-H  •••  H — C7  in  6-Et-IAA)  with  H  H  dis¬ 
tances  around  2.4  A  and  two  (CIO — H  H — C6  in 

5- Et-IAA,  CIO — H  H — C5  in  6-Et-IAA)  with 

o 

H  •••  H  distances  around  2.65  A;  if  the  ethyl  group 
is  approximately  perpendicular  to  the  plane  of  the 
indole  ring,  there  is  a  total  of  only  two  such 
interactions  (CIO — H  •  •  •  H — C4  and  CIO — H  •  •  • 
H— C6  in  5-Et-IAA,  CIO— H  —  H— C7  and 
CIO — H  -H — C5  in  6-Et-IAA),  the  distances  of 

o 

which  are  around  2.42  and  2.52  A,  respectively. 

It  is  interesting  to  compare  the  data  of  5-  and 

6- Et-IAA  to  those  of  the  parent  compound  IAA. 


H 


H\  ^C4\ 
HH  XC5^  NC- 


M 

.Cll. 


HH 

V 

xC8n  /Ol 
-C3  C9  H 


H'  *NC10/C6^C7/CnN/C2vH 


02 


A  I  I 

HH  H  H 

FIGURE  1 .  Definition  of  atom  labels,  shown  for 
6-Et-IAA. 
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TABLE  I _ 

Energy  (kJ  /  mol)  and  torsion  angles  (degree)  of  all  symmetry-unique  local  minima  in  the  PES  of  4-Et-IAA. 


Energy 

5.395 

5.893 

6.906 

4.219 

T1  (C9 — C8 — C3 — C2) 

0.00 

103.21 

78.95 

92.21 

T2  (02=C9 — C8 — C3) 

0.00 

2.07 

-96.47 

107.17 

T3  (C11—  CIO— C4— C5) 

0.00 

-2.24 

-4.43 

-6.95 

Energy 

1.476 

4.540 

10.292 

4.886 

T1  (C9 — C8 — C3 — C2) 

8.50 

107.98 

87.14 

109.79 

T2  (02= C9 — C8 — C3) 

-2.90 

-20.77 

-92.25 

133.72 

T3  (C11—  CIO— C4— C5) 

-90.31 

-78.11 

-70.13 

-79.60 

Energy 

2.876 

2.987 

0.000 

T1  (C9 — C8 — C3 — C2) 

102.46 

81.41 

94.87 

T2  (02= C9 — C8 — C3) 

0.19 

-96.68 

110.83 

T3  (C11—  CIO— C4— C5) 

91.50 

94.26 

92.03 

Zero  energy  corresponds  to  an  absolute  value  of  -666.1894024  Hartrees. 


The  6-31G*-PES  of  IAA  contains  four  symmetry- 
unique  local  minima  [30],  with  the  following  val¬ 
ues  of  T1  and  T2:  0°/0°  (Erel  =  0),  112.51°/103.57° 
(Erel  =  0.50  kj/mol),  99.06°/-  96.41°  (Erel  -  2.00 
kj/mol),  and  11L85°/1.60°  (Erel  =  4.36  kj/mol). 
The  T1/T2  values  of  5-  and  6-Et-IAA  therefore 
deviate  less  than  3°  from  those  of  unsubstituted 
IAA.  A  similar  correspondence  can  be  observed  for 
the  energies:  For  conformers  with  a  tilted  ethyl 
group,  the  maximum  deviation  from  the  IAA  en¬ 
ergy  pattern  is  0.31  kj/mol  (for  6-Et-IAA  with 
T1/T2  =  112.95°/102.49°),  and  for  those  with  the 
in-plane  ethyl  group,  the  maximum  difference  is 
0.55  kj/mol.  The  latter  deviation  occurs  for  the 
5-Et-IAA  conformer  with  T1/T2/T3  = 


112.50o/101.42o/179.58°  and  is  remarkable  because 
only  in  this  case  is  the  energy  lower  than  that  of 
the  conformer  with  the  same  orientation  of  the 
ethyl  group  and  T1  =  T2  =  0°.  This  deviation  can 
be  explained  by  the  weak  electrostatic  C9=02  •  •  •  H 
— C4  interaction,  which  occurs  in  all  5-  and  6- 
Et-IAA  conformers  with  T1  ~  T2  ~  100°.  The 
O  •  •  •  H  distances  of  this  interaction  are  around  2.9 

o 

A.  In  the  specific  5-Et-IAA  case,  it  reduces  the  net 
charge  of  the  hydrogen  atom  on  C4  just  enough  to 
weaken  the  repulsive  H  *  •  *  H  interactions,  which 
were  discussed  above.  As  a  result,  the  increase  in 
energy  for  this  specific  conformer  is  less  than  that 
of  all  others  with  an  in-plane  ethyl  group,  which 
results  in  the  interchange  in  energy.  For  the  corre- 


TABLE  II _ 

Energy  (kJ  /  mol)  and  torsion  angles  (degree)  of  all  symmetry-unique  local  minima  in  the  PES  of  5-Et-IAA. 


Energy 

5.426 

9.461 

7.141 

5.376 

T1  (C9 — C8 — C3 — C2) 

0.00 

112.13 

98.34 

112.50 

T2  (02=C9 — C8 — C3) 

0.00 

2.67 

-98.94 

101.42 

T3  (C11—  CIO— C5— C6) 

180.00 

-179.70 

-179.79 

179.58 

Energy 

0.000 

4.104 

1.902 

0.207 

T1  (C9 — C8 — C3 — C2) 

0.01 

112.02 

98.36 

112.46 

T2  (02 = C9 — C8 — C3) 

0.04 

2.54 

-99.06 

101.62 

T3  (C11— CIO— C5— C6) 

-81.34 

-81.09 

-81.05 

-81.47 

Energy 

4.274 

1.918 

0.247 

T1  (C9 — C8 — C3 — C2) 

1 1 1 .66 

98.42 

112.58 

T2  (02 = C9 — C8 — C3) 

1.40 

-98.85 

102.04 

T3  (Cl  1  —Cl  0— C5— C6) 

81.31 

81.20 

81.51 

Zero  energy  corresponds  to  an  absolute  value  of  -666.1926655  Hartrees. 
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TABLE  III _ 


Energy  (kJ  /  mol)  and  torsion  angles  (degree)  of  all  symmetry-unique  local  minima  in  the  PES  of  6-Et-IAA. 


Energy 

5.148 

9.175 

6.691 

5.216 

T1  (C9 — C8 — C3 — C2) 

0.00 

113.18 

99.53 

113.02 

T2  (02=C9 — C8 — C3) 

0.00 

4.26 

-99.19 

101.02 

T3(C11— CIO— C6— C7) 

0.00 

0.20 

0.14 

0.01 

Energy 

0.000 

4.129 

1.701 

0.194 

T1  (C9 — C8 — C3 — C2) 

0.20 

112.63 

98.89 

112.95 

T2  (02=C9 — C8 — C3) 

-0.07 

3.35 

-99.75 

102.49 

T3  (C11 — CIO — C6 — C7) 

-97.22 

-97.94 

-98.20 

-98.14 

Energy 

4.161 

1.708 

0.190 

T1  (C9 — C8 — C3 — C2) 

112.97 

98.83 

112.56 

T2  (02=C9 — C8 — C3) 

3.98 

-98.91 

101.76 

T3  (Cl  1 —  CIO— C6— C7) 

98.11 

98.62 

98.33 

Zero  energy  corresponds  to  an  absolute  value  of  -666.1932497  Hartrees. 


sponding  6-Et-IAA  conformers,  in  which  the  ori¬ 
entation  of  the  ethyl  group  is  toward  C 7  instead  of 
C4,  the  sequence  of  relative  energies  is  identical  to 
that  of  IAA. 

A  notable  difference  between  5-  and  6-Et-IAA, 
on  the  one  hand,  and  unsubstituted  IAA,  on  the 
other  hand,  is  that  the  global  minima  in  the  former 
are  not  mirror-symmetrical,  which  also  is  a  conse¬ 
quence  of  the  increased  H  •••  H  repulsion  in  the  Cs 
orientation.  The  acetic  acid  side  chain,  however,  is 
coplanar  with  the  indole  ring  in  the  global  minima 
of  5-  and  6-Et-IAA,  as  it  is  in  IAA.  This  arrange¬ 
ment  results  in  a  weak  C9=02  H — C2  hydrogen 
bond  with  a  bond  order  [34]  of  0.016  and  0---H 
distances  of  2.391  A  (5-Et-IAA),  2.394  A  (6- 
Et-IAA),  and  2.391  A  (IAA).  This  hydrogen  bond 
also  occurs  in  the  mirror-symmetrical  conformer  of 
5-Et-IAA  and  6-Et-IAA,  with  the  same  bond  or¬ 
der  of  0.016  and  0---H  distances  of  2.392  and 
2.396  A,  respectively. 

The  similarity  between  the  PES  of  IAA  and 
those  of  5-  and  6-Et-IAA  is  not  limited  to  the 
position  of  the  local  minima:  It  also  extends  to  the 
T1  /T2  reaction  paths.  Figure  2  compares  the  posi¬ 
tions  of  all  local  minima  and  saddle  points  of  IAA 
and  5-Et-IAA,  and  Figure  3  does  the  same  for  IAA 
and  6-Et-IAA.  Despite  the  energy  shift  of  approxi¬ 
mately  5  kj/mol  for  the  conformers  with  an  in¬ 
plane  orientation  of  the  ethyl  group,  the  energy 
barriers  for  the  internal  rotations  (with  constant 
orientation  of  the  ethyl  group)  are  almost  identical 
in  all  cases.  Figure  4  shows  this  for  the  internal 
rotations  of  T2  (with  T1  ~  100°)  in  IAA  and  5- 
Et-IAA. 


For  4-Et-IAA,  the  situation  is  significantly  dif¬ 
ferent,  because  of  a  variety  of  intramolecular  inter¬ 
actions.  One  is  the  C9=02-*-H — C2  hydrogen 
bond,  which  is  also  present  in  5-  and  6-Et-IAA.  It 
occurs  for  the  conformers  with  T1  *  T2  ~  0°  and 
is  slightly  stronger  in  4-Et-IAA,  with  a  bond  order 
of  0.018  (both  forms)  and  O  H  distances  of  2.314 
A  (Cs  form)  and  2.323  A  (C3  form).  Another  slightly 


FIGURE  2.  Positions  of  all  symmetry-unique  stationary 
points  in  the  PES  of  IAA  and  5-Et-IAA  that  relate  to 
internal  rotations  of  the  acetic  acid  side  chain.  The  solid 
lines  indicate  the  reaction  paths  in  the  PES  of 
unsubstituted  IAA:  (x)  IAA,  local  minima;  (®)  IAA,  saddle 
points;  (+)  5-Et-MA,  local  minima;  (©)  5-Et-MA,  saddle 
points. 
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FIGURE  3.  Positions  of  all  symmetry-unique  stationary 
points  in  the  PES  of  IAA  and  6-Et-IAA  that  relate  to 
internal  rotations  of  the  acetic  acid  side  chain.  The  solid 
lines  indicate  the  reaction  paths  in  the  PES  of 
unsubstituted  IAA:  (x)  IAA,  local  minima;  (<s>)  IAA,  saddle 
points;  (+)  6-EMA,  local  minima;  (©)  6-Et-IAA,  saddle 
points. 


FIGURE  4.  Energy  of  the  stationary  points  of  IAA  and 
5-Et-IAA  with  T1  ~  100°  along  the  internal  rotation  of  T2; 
zero  energy  corresponds  to  the  conformer  with  T2  * 
100°  in  all  cases.  (•)  IAA;  (*)  5-Et-IAA,  T3  «  180°;  (©) 
5-Et-IAA,  T3  *  -80°;  (®)  5-Et-IAA,  T3  *  80°. 


stronger  hydrogen  bond,  CIO — H--02=C9,  oc¬ 
curs  in  the  conformers  with  T1/T2/T3  =  92.21°/ 
107.17°/  -  6.95°  (H  O  distance:  2.560  A,  bond 
order:  0.019)  o  and  T1/T2/T3  =  94.87°/110.83°/ 
92.03°  (2.552  A,  0.021).  Because  of  the  stabilizing 
effect  of  this  hydrogen  bond,  the  latter  conforma¬ 
tion,  characterized  by  both  side  chains  more  or  less 
perpendicular  to  the  indole  ring  plane  and  point¬ 
ing  toward  opposite  sides  of  this  plane,  is  the 
global  minimum  in  the  PES.  Other  distances  of 
interest  in  this  structure  are  those  between  Ol  and 

o 

the  hydrogen  atom  at  position  2,  which  is  3.358  A, 
and  CIO — H  •••  H — C8,  which  is  2.323  A,  A  weaker 
form  of  the  CIO — H---02-C9  hydrogen  bond  is 
also  present  in  the  conformer  with  T1/T2/T3  = 
102.46°/0.19°/9o1.50°;  the  H-O  distance  in  this 
case  is  2.750  A  and  the  bond  order  0.011.  Yet 
another  weak  hydrogen  bond,  CIO — H  •••  Ol — C9, 
occurs  in  the  conformers  with  T1/T2/T3  = 
78.95°/  -  96.47°/  -  4.43°  (H  -  O  distance:  2.607  A, 
bond  order:  0.011)  and  T1/T2/T3  =  98.42°/ 
-98.85°/81.20°  (2.585  A,  0.012). 

Similar  to  5-  and  6-Et-IAA,  repulsive  H**H 
interactions  are  present  in  all  4-Et-IAA  conform¬ 
ers.  In  contrast,  however,  not  all  of  them  are  be¬ 
tween  aromatic  and  aliphatic  hydrogen  atoms.  In¬ 
stead,  some  H  •  •  *  H  interactions  occur  between  the 
two  side  chains;  in  the  mirror-symmetrical  con¬ 
former  of  4-Et-IAA,  for  example,  the  hydrogen 
atoms  of  both  methylene  groups  are  pointing  di¬ 
rectly  toward  each  other  (with  two  H  •  •  •  H  dis¬ 
tances  of  2.320  A).  In  the  local  minimum  of  highest 
relative  energy  (Erel  =  10.292  kj/mol),  the 
CIO — H**H — C8  distance  is  as  low  as  2.015  A. 
(For  this  specific  local  minimum,  an  increase  of  T2 
immediately  leads  to  a  saddle  point  at  T1  /T2  = 
87.96°/  -  84.90°,  which  is  only  0.003  kj /mol  higher 
in  energy.  The  harmonic,  unsealed  vibration  fre¬ 
quencies,  which  correspond  to  that  reaction  path, 
are  13.83  i  and  19.19  cm-1  for  the  saddle  point  and 
the  minimum,  respectively.  The  latter  value  is 
equivalent  to  a  zero-point  energy  of  0.115  kj/mol, 
which  means  that  this  local  minimum  is  just  a 
mathematical  feature  of  the  PES,  but  does  not 
produce  a  stable  conformer.) 

The  considerable  steric  strain,  which  is  caused 
by  these  short  H  •  •  •  H  distances,  is  reflected  by  the 
absolute  energies  of  the  global  minima:  5-Et-IAA 
is  about  8.6  kj/mol  and  6-Et-IAA  about  10.1 
kj/mol  lower  in  energy  than  is  4-Et-IAA.  It  also 
affects  the  positions  of  the  local  minima  and  the 
saddle  points  in  the  T1/T2  space.  In  contrast  to  5- 
and  6-Et-IAA,  these  positions  vary  significantly 
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FIGURE  5.  Positions  of  all  symmetry-unique  stationary 
points  in  the  PES  of  IM  and  4-Et-IAA  that  relate  to 
internal  rotations  of  the  acetic  acid  side  chain.  The  solid 
lines  indicate  the  reaction  paths  in  the  PES  of 
unsubstituted  IAA:  (x)  IAA,  local  minima;  (®)  IAA,  saddle 
points;  (+)  4-Et-IAA,  local  minima;  (©)  4-Et-IAA,  saddle 
points. 


with  the  orientation  of  the  ethyl  group  and  deviate 
up  to  30°  from  those  of  the  parent  compound  IAA. 
Figure  5  shows  the  T1/T2  positions  of  all  local 
minima  and  saddle  points  of  4-Et-IAA  in  compari¬ 
son  to  those  of  IAA.  Despite  these  deviations,  the 
pattern  of  T1/T2  reaction  paths  in  the  PES  of 
unsubstituted  IAA  is  still  recognizable  in  that  of 
4-Et-IAA.  This  is  a  remarkable  result  in  view  of 
the  data  for  4-C1-IAA  [31],  where  the  symmetry- 
unique  local  minima  are  at  T1/T2  positions  of 
0°/0°,  105.45°/“  14.01°,  110.92°/162.94°,  and 
6.25°/114.29°  and  the  pattern  of  reaction  paths  is 
completely  different  (e.g.,  there  is  no  internal  rota¬ 
tion  of  T2  with  T1  «  100°). 


Summary  and  Conclusion 

The  PES  of  4-,  5-,  and  6-Et-IAA  were  inves¬ 
tigated  via  ab  initio  RHF/6-31G*  calculations. 
For  each  compound,  11  symmetry-unique  local 
minima  with  syn-periplanar  orientation  of  the 
— COOH  group  are  present  in  the  PES.  In  contrast 
to  IAA  and  its  chlorinated  derivatives,  the  global 
minima  of  5-  and  6-Et-IAA  are  not  mirror-sym¬ 
metrical  but  characterized  by  the  acetic  acid  side 


chain  coplanar  with  the  indole  ring  and  the  ethyl 
group  almost  perpendicular  to  this  plane.  In  4- 
Et-IAA,  a  weak  hydrogen  bond  between  the  two 
side  chains  yields  a  geometry  for  the  global  mini¬ 
mum,  in  which  both  side  chains  point  toward 
opposite  sides  of  the  indole  ring  plane.  In  all  three 
cases,  the  PES,  therefore,  has  two  global  minima, 
which  are  "degenerate"  in  the  terminology  of 
quantum  mechanics. 

Comparison  with  the  results  obtained  earlier  for 
unsubstituted  IAA  [30]  shows  that  ethylation  in 
position  5  or  6  introduces  only  minor  changes  of 
the  PES,  which  do  not  affect  the  reaction  paths 
related  to  the  acetic  acid  side  chain.  The  same  has 
been  found  for  5-  and  6-C1-IAA  [32].  In  the  case  of 
4-Et-IAA,  the  deviations  from  unsubstituted  IAA 
are  much  larger,  but  despite  these  deviations,  the 
pattern  of  T1  /T2  reaction  paths  of  the  IAA  PES  is 
also  present  in  that  of  4-Et-IAA.  This  is  a  remark¬ 
able  contrast  to  the  situation  in  4-C1-IAA  [31], 
where  some  local  minima  appear  at  significantly 
different  positions  in  T1/T2  space,  and  the  reac¬ 
tion  paths  are  completely  different.  This  compari¬ 
son  shows  that  the  different  PES  of  4-C1-IAA  is  an 
effect  that  is  specifically  related  to  the  chloro  sub¬ 
stituent  at  position  4.  Interestingly,  the  qualitative 
picture,  which  one  obtains  from  the  PES  of  4- 
Cl-IAA,  IAA,  and  4-Et-IAA,  is  well  in  accord 
with  the  measured  biological  data. 

The  results  of  this  study  also  show  that  rela¬ 
tively  weak  intramolecular  interactions  can  signifi¬ 
cantly  influence  the  orientation  of  the  acetic  acid 
side  chain  in  IAA  derivatives.  The  same  can  be 
expected  from  the  intermolecular  interactions  that 
enable  the  binding  to  any  auxin  receptor.  Hence, 
the  combined  results  of  the  current  work  and  the 
previous  studies  of  indole  auxins  present  a  basis 
for  the  investigation  of  the  actual  binding  process. 
The  acetic  acid  side  chain,  as  well  as  any  nonrigid 
substituent,  can  be  expected  to  show  a  significant 
amount  of  flexibility,  that  is,  it  can  easily  adopt 
different  orientations.  Therefore,  in  the  auxin-re¬ 
ceptor  complex,  considerable  deviations  from  the 
structures  of  the  isolated  compound  or  the  respec¬ 
tive  solid-state  structures  can  be  anticipated. 
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ABSTRACT:  The  energetics  of  formation  of  a  triple-helical  structure  in 
homopurine-homopyrimidine  mixtures  has  been  modeled  using  Poisson-Boltzmann 
calculations.  Oligomers  with  the  sequence  d( TC)„  and  d(AG)n  form  hydrogen-bonded 
triple-helical  structures  of  the  form  d(TC)n  •  d(AG)n  •  d( TC+)„.  The  third  base,  a 
pyrimidine  in  this  case,  forms  Hoogsteen-type  hydrogen  bonds  with  the  purine,  requiring 
that  the  cytosine  residues  of  the  third  strand  protonate  at  N3.  The  p  Ka  of  cytosine,  4.3  in 
the  isolated  solvated  molecule,  is  raised  by  the  strong  electrostatic  field  in  the  triple  helix. 
We  have  done  calculations  of  the  effective  p  Ka  of  this  cytosine  and  compared  the  results 
with  experimental  studies  of  triple-helix  formation  as  a  function  of  pH.  This  provides  a 
test  of  various  models  of  the  dielectric  constant  for  triplex  DNA  and  its  local  environment. 
©  1998  John  Wiley  &  Sons,  Inc.  Int  J  Quant  Chem  70:  1177-1184,  1998 

Key  words:  p  Ka  shift;  acid  dissociation  constant;  triple  helix;  dielectric  constant 


Introduction 

Oligonucleotide  hybridization  is  sufficiently 
robust  to  include  formation  of  triple-helical 
constructs  besides  the  more  familiar  double  he¬ 
lices.  Hybridization  to  form  duplex  structures  is 
highly  sequence  specific.  The  high  degree  of  com¬ 
plementarity  of  G  with  C  and  of  A  with  T  is 
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essential  for  gene  function.  In  triplex  formation  the 
third  strand  associates  with  an  existing  duplex 
through  hydrogen  bonding  of  the  third-strand 
bases  with  the  Watson-Crick  duplex  base  pairs 
with  significant  specificity.  The  most  prevalent 
type  of  triple  helix  is  formed  by  binding  of  a 
homopurine  or  homopyrimidine  single  strand  in 
the  major  groove  of  a  homopurine-homopyrimi¬ 
dine  duplex  with  the  two  pyrimidine  strands  an¬ 
tiparallel  [1].  When  a  G-C  base  pair  is  recognized 
by  a  C  on  the  third  strand,  the  interaction  is  pH 
dependent,  increasing  with  greater  acidity.  Arnott 
et  al.  [2]  showed  that  protonation  of  the  cytosine 
N3  allows  the  formation  of  Hoogsteen  hydrogen 
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bonding  to  G  as  shown  in  Figure  1,  providing  a 
rationale  for  the  pH  dependence. 

The  pKn  of  deoxycytidine  is  4.3  in  aqueous 
solution  and  varies  by  about  0.1  as  the  ionic 
strength  goes  from  0  to  1  M  [3].  Measurements  of 
the  pH  dependence  of  the  formation  of  triplexes  in 
which  the  third  strand  is  rf(CTTCCTCCTCT)  show 
the  midpoint  of  the  association  to  occur  at  pH  5.8 
[4].  A  similar  analysis  of  triple-helix  formation 
in  hairpin  regions  with  the  third  strand  being 
d(TTCTTCTTC)  or  d(CCTCCTCCT)  yielded  mid¬ 
points  at  6.15  and  6.19,  respectively  [5].  Callahan 
and  co-workers  [6]  found  a  triplex-formation  mid¬ 
point  at  pH  5.6  for  d(CT)s-d(AG)s-d(CT)s,  while 
Singleton  and  Dervan  [7]  found  a  midpoint  in  the 
association  constant  curve  for  a  rf(CT)5-containing 
oligonucleotide  to  occur  at  pH  5.5.  Lavelle  and 
Fresco  [8]  provided  evidence  that  the  midpoint  of 
the  triplex-association  constant  versus  pH  curve 
was  a  measure  of  the  cytosine  p  Ka  by  showing 
that  on  dissociation  of  the  triplex  at  pH  7,  one  H  + 
per  cytosine  residue  is  released  into  solution.  The 
experimental  evidence  seems  to  indicate  that  the 
third-strand  cytosine  p  Ka  is  increased  by  about  1.5 
units  when  it  is  bound  in  the  major  groove  as  part 
of  triplex  DNA. 

Calculations  of  the  p  Kn  of  titra table  groups  in 
proteins  using  numerical  solutions  to  the  Pois- 
son-Boltzmann  equation  have  met  with  varying 
degrees  of  success.  Tanford  and  Kirkwood  [9]  de¬ 
veloped  a  theory  of  protein  titration  curves  based 
on  a  model  of  a  low-dielectric  spherical  protein 
with  discrete  unit  charges  at  fixed  locations  all 
embedded  in  a  high-dielectric  (aqueous)  contin¬ 


uum.  When  numerical  methods  for  solving  electro¬ 
static  equations  based  on  higher  resolution  protein 
structures  became  available,  more  accurate  calcu¬ 
lations  of  the  pK/s  of  buried  groups  soon  fol¬ 
lowed.  Using  Poisson's  equation,  Rogers  et  al.  [10] 
relied  on  the  method  of  Warwicker  and  Watson 
[11]  to  calculate  the  change  in  potential  at  one  site 
due  to  protonation  at  another.  Sternberg  et  al.  [12] 
used  this  same  procedure  to  predict  p Ka  shifts  in 
subtilisin  caused  by  mutation  of  charged  residues. 
Coupling  Poisson's  equation  with  the  Boltzmann 
equation  leads  to  a  Poisson-Boltzmann  (PB)  de¬ 
scription  of  the  electrolyte  environment  of  DNA. 
Using  a  dielectric  constant  of  2  or  4  for  the  protein 
interior  and  80  for  the  environment,  the  pK/s  of 
several  proteins  have  been  calculated  (see,  e.g., 
[13-16]). 

Representing  the  anisotropic  atomic  environ¬ 
ment  by  a  single  dielectric  constant  can  be  a  seri¬ 
ous  approximation.  Warshel  [17]  and  Warshel  and 
Aqvist  [18]  have  pointed  out  that  its  value  de¬ 
pends  on  the  property  under  consideration  and 
can  vary  from  4  to  greater  than  40.  Nevertheless, 
the  computational  convenience  of  the  PB  approach 
has  prompted  several  groups  to  seek  optimal  rep¬ 
resentations  of  a  single  dielectric  constant  for  the 
protein  interior.  Demchuk  and  Wade  [19]  identi¬ 
fied  two  location-dependent  classes  of  ionizable 
sites.  To  get  the  best  agreement  with  experimen¬ 
tally  determined  pK/s,  solvent  exposed  sites  were 
assigned  a  dielectric  constant  close  to  that  of  the 
aqueous  solvent  while  buried  sites  had  lower  val¬ 
ues  between  10  and  20.  In  a  study  of  60  sites 
within  7  proteins,  Antosiewicz  [20]  found  that  the 
best  accuracy  could  be  obtained  with  a  interior 
protein  dielectric  constant  of  20.  This  rather  high 
value  has  recently  been  used  by  Schaefer  et  al.  [21] 
in  an  application  of  the  PB  approach  to  the  calcula¬ 
tion  of  free  energy  differences  between  protein 
conformations.  Antosiewicz  et  al.  [20]  compared 
computed  and  experimental  p K(}  shifts  at  63  sites 
using  a  parametrized  set  of  atomic  charges  and 
radii,  PARSE,  which  was  specifically  optimized  to 
reproduce  measured  solvation  energies  of  small 
molecules.  They  found  that  PB-calculated  p  K(l 
shifts  averaged  over  a  set  of  nuclear  magnetic 
resonance  (NMR)  determined  protein  conforma¬ 
tions  could  be  more  accurate  than  the  null  model, 
in  which  all  p  Kn  shifts  of  a  titratable  group  are  the 
same  for  all  members  of  that  group  whatever  their 
location  within  the  protein.  On  the  other  hand, 
Antosiewicz  et  al.  [20]  concluded  that  even  when 
the  extra  computational  effort  was  made  "the 
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pKa's  calculated  using  a  protein  dielectric  constant 
of  4  are  less  accurate  than  those  computed  with  a 
less  plausible  protein  dielectric  constant  of  20." 
Relatively  little  effort  has  been  devoted  to  deter- 
mining  internal  dielectric  constants  for  nucleic 
acids.  Yang  et  al.  [22]  analyzed  a  long  molecular 
dynamics  simulation  of  triplex  DNA  and  found 
the  general  dielectric  constant  of  DNA  to  be  about 
15  with  the  subgroups  of  bases,  sugar  atoms,  and 
phosphates  to  be  4.2,  2.3,  and  48.5,  respectively. 
Lamm  and  Pack  [23]  calculated  the  dielectric  con¬ 
stant  of  the  ionic  environment  near  the  surface  of 
the  B-DNA  and  found  it  to  be  about  30  in  the 
minor  groove  and  50  in  the  major  groove,  agreeing 
well  with  available  experiments. 

In  the  presence  of  the  strong  electrostatic  poten¬ 
tial  at  the  DNA  surface,  the  intrinsic  p  Ka  of  cyto¬ 
sine  would  be  expected  to  increase  due  to  the 
higher  local  H+  concentration  [24].  In  fact,  a  sim¬ 
ple  PB  cell  model  calculation  [25]  predicts  an  in¬ 
crease  in  p  Ka  of  about  2.2  units  for  duplex  DNA. 
This  agrees  well  with  measurements  of  the  appar¬ 
ent  pK/s  of  amino  acids  that  were  covalently 
bound  to  the  minor  groove  of  duplex  DNA,  which 
show  an  increase  of  1.5  to  2.4  units  over  the  pKa's 
in  aqueous  solution  [26].  For  triplex  DNA,  a  simi¬ 
lar  PB  cell  model  calculation  yields  a  somewhat 
larger  intrinsic  p  Ka  change  of  almost  3  units  due 
to  the  increased  surface  charge  density  in  the 
model.  PB  calculations  on  a  more  detailed,  all-atom 
model  of  triplex-helical  DNA  predict  a  p  Ka  change 
of  over  5  units.  These  relatively  simple  calculations 
are  qualitatively  correct  but  exaggerate  the  experi¬ 
mentally  determined  p  Ka  change.  This  study  de¬ 
scribes  initial  attempts  to  apply  the  more  detailed 
protein-based  methods  described  above  to  calcu¬ 
late  the  p  Ka  of  the  cytosine  of  the  third  strand  of 
triplex  DNA.  Results  and  conclusions  follow  a 
brief  discussion  of  the  methods  used. 


Methods 

The  system  chosen  was  an  infinite  repeat  of  the 
homopyrimidine-homopurine-homopyrimidine 
triplex  in  which  the  third  strand  is  protonated. 
The  sequence  for  the  parent  duplex  was  poly 
rf(A-G)~poly  d( T-C);  the  third  strand  was  poly 
rf(T-C).  Numerical  Poisson-Boltzmann  calcula¬ 
tions  were  performed  using  methods  previously 
described  [25].  Briefly,  the  space  occupied  by  the 
DNA  and  its  environment  was  divided  into  many 


finite  volume  cells  in  planar  cross  sections  perpen¬ 
dicular  to  the  DNA  helical  axis.  The  finite  dif¬ 
ference  version  of  the  Poisson  equation  for  the 
electrostatic  potential  at  cell  location  i  on  a 
curvilinear  grid  is 

TVjPj/ej  +  1Lj(<t>jeijSij/rij) 

where  v{  is  the  volume  of  cell  i,  p{  is  the  total 
charge  density,  =  (ez  +  e})/ 2  is  the  arithmetic 
average  of  the  dielectric  constants  of  cells  i  and  j, 
Sl}  is  the  shared  surface  area  of  these  cells,  and  rX] 
is  the  distance  between  cell  centers.  Summations 
are  taken  over  all  cells  j  sharing  a  surface  with  a 
given  cell  i.  Inside  the  DNA  the  charge  within 
each  cell  was  fixed  as  calculated  from  the  overlap 
with  the  atomic  van  der  Waals  spheres  and  pro¬ 
vided  a  charge  density  px  =  c\i/vl.  The  total  charge 
density  in  each  cell  in  the  environment  was  ob¬ 
tained  from  the  individual  ionic  charge  densities 
by  summation:  p{  =  E*  pf.  The  Boltzmann  equa¬ 
tion  for  each  ion  type  can  be  written  as 

k  =  Nfcexp(-/3zt<ft,) 

Pl  I>,exp(-/3z^,) ' 

in  which  zk  is  the  ion  valence  and  Nk  is  the  total 
number  of  ions  of  type  k  within  the  system.  A 
finite  stretch  of  DNA  was  chosen  as  a  repeat  unit 
and  the  intercell  links  and  shared  surface  areas 
between  the  cells  in  the  planar  cross  sections  at 
either  end  of  the  repeat  were  calculated,  effectively 
wrapping  the  finite  element  grid  around  to  achieve 
an  infinite  repeat  of  linear  DNA. 

The  thermodynamic  cycle  illustrated  in  Figure  2 
was  used  to  define  the  free  energy  of  protonation 


model 

Cyt  H+(sol)  — y 

1 

Cyt(sol)  + 

1 

H+(sol) 

DNA 

Cyt  H+(DNA)  —4 

Cyt(DNA)  + 

H+(sol) 

FIGURE  2.  The  thermodynamic  cycle  used  in  the 
definition  of  the  pKa  difference  between  the  model 
compound,  deoxycytidine,  in  water  (top  arrow)  and 
deoxycytidine  on  the  third  strand  of  triple-helical  DNA 
(bottom  horizontal  arrow).  The  top  horizontal  process  is 
an  experimental  quantity  that— combined  with  the  two 
processes  represented  by  the  vertical  arrows— defines 
the  pKa  of  the  bottom  horizontal  process,  the  protonation 
of  deoxycytidine  on  the  third  strand  of  triple-helical  DNA. 
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of  a  cytosine  residue  in  triplex  DNA  (bottom  ar¬ 
row)  in  terms  of  the  known  p  Ka  of  cytosine  in 
solution  (top  line)  and  the  energies  required  to 
transport  the  charged  (left  vertical  arrow)  and  un¬ 


charged  (right  vertical  arrow)  cytosine  from  solu¬ 
tion  into  the  DNA  triplex.  Following  Bashford  and 
Karplus  [14],  the  expression  for  the  fraction  0,  of 
cytosine  i  protonated  is 


^x)x,exp[EMxM(2.303(pKinlr^  -  pH))  -  jE^(  ,({ X}))] 

E(X)exp[EMxM(2.303(pKintr,M  -  pH))  -  ^(^W^X}))] 


in  which  {X}  is  a  set  of  “protonation  state"  vec¬ 
tors,  each  of  which  has  n  elements  that  are 
either  1  or  0,  depending  whether  site  fi  is  pro¬ 
tonated  or  not.  There  are  2”  members  of  {X} 
corresponding  to  each  possible  protonation  state 
shown  above.  As  discussed  below,  the  fact  that  we 
are  using  an  infinitely  long  model  for  DNA  intro¬ 
duces  some  difficulties  into  the  definition  of  all 
possible  protonation  states. 

The  intrinsic  p  Ka,  determined  by  neglecting  in¬ 
teractions  between  protonated  sites,  is  given  by 
Eq.  (2), 

P^intr  model 

—  [  AAGBorn  -  AAGback]/(2.303  kT),  (2) 

in  which  the  following  quantities  are  calculated 
from  the  PB-determined  potentials  and  charges: 

AAGBorn  =  \  NA,i  "  ^model,  J 

i 

-iEQrl^DNA^-Codel,/]  (3) 

i 

and 

^AGback  —  E  Cjj[  ^SnA,  j  ~~  ^model,/] 

I 

~  Efyt^DNA,/  “  ^model/l*  (4) 
j 

AAGBorn  is  the  difference  in  the  Born  free  energy 
between  charging  site  i  in  the  model  (solvated 
cytosine)  and  in  DNA.  Qf  and  Q ■  are  the  charges 
at  the  titrating  sites  when  protonated  and  unproto- 
nated,  respectively,  <A£na,/  and  </>dna,/  similarly 
represent  the  calculated  electrostatic  potentials  at 
site  i  when  the  site  is  protonated  and  unproto- 
nated,  and  ^odel/  and  Codei,/  are  the  corre~ 


sponding  potentials  calculated  in  the  model  com¬ 
pound.  AAGback  is  the  interaction  of  the  titrating 
sites  with  the  nontitrating  charges  c\y  The  electro¬ 
static  repulsion  between  simultaneously  proto¬ 
nated  sites  is  given  by  Eq.  (5): 

EIQ',  -  Q;,J[*6 na,/..-  *6na,/.  J- 

(5) 

flM  v  is  the  interaction  of  the  titrating  sites  of  DNA 
with  each  other  and  represents  the  fact  that  site  fi 
is  more  difficult  to  protonate  if  site  v  is  already 
protonated.  The  pH  at  which  6i  =  0.5  is  defined  as 
the  p  Ka  of  the  site. 

The  finite  repeat  unit  chosen  for  these  calcula¬ 
tions  was  the  d(A-G)6-d(T-C)6-d(T-C)6  dode- 
camer.  The  geometry  of  the  triplex  was  generated 
from  the  x-ray  diffraction  structure  of  poly(dT)- 
poly(rfA)-poly(rfT)  [27]  by  replacing  alternate 
T-A-T  triads  with  C-G-C+  triads.  Six  equivalent 
sites  of  protonation,  the  N3  of  the  cytosines  of  the 
third  strand,  are  available  within  this  repeat  unit 
so  that  64  (26)  protonation  states  are  possible  for 
the  six  sites.  The  fact  that  these  are  exactly  re¬ 
peated  along  the  helical  axis  results  in  the  approxi¬ 
mation  that  the  infinite  number  of  protonation 
states  possible  is  represented  by  the  repeating  of 
each  of  the  64  states.  The  calculation  of  the  intrin¬ 
sic  p  Ka,  reflecting  the  tendency  of  a  site  to  accept  a 
proton  when  all  other  sites  are  neutral,  cannot  be 
exactly  determined  with  this  approach  because 
protonating  a  single  site  in  the  repeat  results  in 
that  site  being  protonated  in  each  of  its  images. 
However,  the  repeat  unit  extends  39.36  A  making 
this  image-site-central-site  interaction  energy 
small.  We  present  a  value  for  the  intrinsic  p  Ka  but 
note  its  approximate  nature. 
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The  64  states  include  6  singly  protonated,  15 
doubly  protonated,  20  triply  protonated,  15  with 
four  of  the  six  sites  protonated,  and  6  with  five  of 
the  six  sites  protonated.  In  addition  there  is  one 
state  that  is  not  protonated  and  one  that  is  fully 
protonated.  The  summations  in  Eq.  (1)  are  over 
these  64  states.  Because  the  sites  are  equivalent, 
the  extent  of  protonation  of  a  single  site  is  all  that 
needs  to  be  considered. 

The  site-site  interaction  term  W^tV  for  an  in¬ 
finitely  repeating  polymer  was  determined  using 
the  nearest-neighbor  (1,2)  and  next-nearest  (1,3) 
interactions  calculated  using  Eq.  (5)  based  on  the 
PB-determined  electrostatic  potentials.  Recogniz¬ 
ing  that  the  PB  calculation  includes  the  effect  of 
images  of  the  central  repeat,  the  PB-calculated 
effect  of  charging  site  2  in  the  presence  of  a  charge 
on  site  1  (and  its  images)  can  be  written  as: 

1  =  fli,2  =  W1/2  =  Wlf2  +  2£w1#6/+1.  (6) 

i 

Similarly,  the  next-nearest  site-site  interactions 

«U+ 2  =  fli,3  =  Wu  +  2EWWj+,  (7) 

i 

The  indices  on  ft^  v  range  from  1  to  6  while  the  j 
subscript  on  Wlf .  can  increase  without  bound.  To 
calculate  the  remaining  ft  ,  we  calculated  the 
coulombic  sums 

n 

W;,I+i(n)  =  l/rM+1  +  2£l/r1/6y+f+1 

i 

n 

Wi/I+2(  n)  =  l/ri/i+2  +  2£l/r16/+i+2 


i+5(«)  =  1  /ru+s  +  2£l /rii6j+i+5. 

j 

The  PB  calculations  were  performed  for  a  variety 
of  conditions,  leading  to  different  calculated  val¬ 
ues  of  ft and  ft13.  For  each  PB  calculation  the 
ratio  ft:  2/ftx  3  =  x23  was  determined.  The  cou¬ 
lomb  sums  were  then  truncated  at  n  such  that 
W12(w)/W13(n)  =  x23.  This  value  of  n  was  then 
used  to  determine  the  ratios  *34  =  WM(n)/ 
WM(n),  x45,  and  x56.  Finally,  we  invoked  the 
approximation  ft14  =  ft  1,3/234/  ft  1,5  =  ft 1, 4/245, 
and  ftx  6  —  ftx  5/x56. 


Results  and  Conclusions 

Poisson-Boltzmann  calculations  were  per¬ 
formed  using  methods  previously  described  [25]. 
The  DNA  and  counterions  were  confined  to  a 

o 

cylindrical  cell  of  radius  100  A  corresponding  to  a 
nucleotide  concentration  of  40  mM.  A  concentra¬ 
tion  of  100  mM  1:1  monovalent  salt  was  added  to 
the  DNA-counterion  mixture  resulting  in  140  mM 
monovalent  cations  and  100  mM  monovalent  an¬ 
ions  surrounding  the  central  DNA  molecule.  Fol¬ 
lowing  Bashford  and  Karplus  [14],  PB  calculations 
were  done  on  the  model  compound  (neutral  and 
protonated)  deoxycytidine,  with  the  same  grid 
used  for  the  full  DNA  calculations.  Several  PB 
calculations  were  required  for  the  central  DNA. 
The  fully  unprotonated  (i.e.,  the  triplex  had  a 
charge  of  -36)  calculation  was  done  along  with 
cytosine  1  protonated,  cytosines  1  and  2  proto¬ 
nated,  and  cytosines  1  and  3  protonated.  The  dou¬ 
bly  protonated  calculations  were  required  for  the 
evaluation  of  ft  [Eq.  (5)]. 

Calculations  were  performed  assuming  that  the 
internal  dielectric  constant  of  DNA  had  a  uniform 
value  of  4  and  that  of  the  environment  was  78.4. 
This  gave  a  pKa  of  4.1  for  the  third-strand  cyto¬ 
sine.  Based  on  a  pKfl  of  4.3  for  the  model  com¬ 
pound,  incorporation  of  cytosine  into  the  negative 
electrostatic  potential  environment  of  DNA  would 
be  expected  to  raise  its  p  Ka  by  perhaps  2  units,  as 
discussed  earlier.  The  intrinsic  p Ka  [Eq.  (2)]  of  6.7 
calculated  for  these  conditions  is  not  unreasonable, 
but  this  value  is  lowered  to  4.1  by  the  site-site 
interactions.  This  result  suggests  that  the  internal 
DNA  dielectric  constant  to  be  used  in  the  calcula¬ 
tion  of  ft^  „  should  be  larger  than  the  value  of  4 
used  in  the  intrinsic-pKfl  determination  [17,  18]. 

A  second  set  of  calculations  using  an  internal 
DNA  dielectric  constant  of  10  yielded  a  p  Ka  of  8.7, 
a  value  high  compared  to  experiment.  The  titration 
curves  for  protonation  of  the  third-strand  cytosine, 
calculated  using  Eq.  (1)  are  shown  in  Figure  3.  The 
intrinsic  p Ka  curve,  assuming  no  site-site  interac¬ 
tions,  is  shown  along  with  the  full  titration  curve 
to  emphasize  the  role  of  those  interactions,  which 
shift  the  midpoint  of  the  curve  from  10.2  to  8.7.  A 
third  set  of  calculations  with  a  DNA  dielectric 
constant  of  4  and  a  variable  dielectric  constant  for 
the  environment  [23]  was  also  done.  The  results  of 
all  these  calculations  are  summarized  in  Table  I. 
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FIGURE  3.  Proton  titration  curve  for  the  N3  of  cytosine  on  the  third  stand  of  triple-helical  DNA.  The  internal  dielectric 
constant  for  DNA  was  10  and  for  the  environment  it  was  78.4.  The  curve  to  the  right  represents  the  curve  calculated 
assuming  no  site-site  interaction. 


eps(dna)  =  10. 
eps(env)  =  78.4 


Further  insight  into  the  free  energy  differences 
accompanying  protonation  of  the  third-strand  cy¬ 
tosine  can  be  gained  by  molecular  orbital  calcula¬ 
tions.  Using  a  geometry  determined  from  an  un¬ 
constrained  optimization,  we  calculated  the  wave 
function  for  the  C+~G-C  base  triad  within  the 
PM3  approximation.  Figure  4  shows  the  electro¬ 
static  potential  in  the  regions  surrounding  each 
base.  When  the  proton  is  removed  and  the  electro¬ 
static  potential  recalculated  (without  further  opti¬ 
mization)  a  highly  negative  region  appears  at  the 
position  vacated  by  the  proton,  as  indicated  in 
Figure  5.  (Additional  optimization  leads  to  a  struc¬ 
ture  in  which  the  exocyclic  amine  of  the  third-stand 
cytosine  forms  hydrogen  bonds  with  both  the  gua¬ 
nine  N 7  and  carbonyl  oxygen.)  This  supports  the 


TABLE  I _ 

p Ka  and  intrinsic  p Ka  for  the  three  dielectric 
constant  combinations  described  in  the  text. 


e(DNA)  /  e(env) 

pKintm 

P  Ka 

4./  variable 

14.23 

10.00 

10/78.4 

10.18 

8.70 

4./ 78.4 

6.73 

4.05 

suggestion  of  Lavelle  and  Fresco  [8]  that  the  pres¬ 
ence  of  the  proton  is  not  primarily  to  stabilize  the 
triplex  by  forming  another  hydrogen  bond  but 
rather  to  negate  the  strong  electrostatic  repulsion 
caused  by  overlapping  of  the  lone-pair  electrons 
on  the  guanine  N 7  and  cytosine  N3  atoms. 


Conclusions 

The  calculation  of  the  p  K(l  shift  of  an  ionizable 
site  at  the  DNA  surface  can  be  accomplished  with 
a  fair  degree  of  accuracy  by  calculating  the  electro¬ 
static  potential  in  the  environment  adjacent  to  the 
site  of  protonation  [26].  The  calculation  of  this 
potential  is  affected  little  by  assumptions  regard¬ 
ing  the  dielectric  constant  of  the  DNA  interior. 
Sites  such  as  the  N3  of  the  third-strand  cytosine, 
however,  present  a  greater  challenge.  Buried  within 
the  DNA,  the  dielectric  constant  chosen  for  the 
polyion  interior  has  a  great  influence  on  the  calcu¬ 
lated  p  Kn  shift.  The  assumption  that  the  dielectric 
constant  of  the  interior  of  DNA  can  be  represented 
by  a  single,  isotropic,  scalar  quantity  presents  a 
further  difficulty.  At  this  stage  it  seems  that  the 
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accurate  prediction  of  p  Ka  shifts  within  the  inte¬ 
rior  of  DNA  awaits  further  developments. 
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ABSTRACT:  The  integrated  molecular  transform  (FTm)  is  a  unitary  numerical  index 
of  structure  that  is  capable  of  uniquely  representing  different  molecular  structure 
conformations  with  the  exception  of  enantiomers.  Other  molecular  indices  have  been 
derived  from  FTm  as  well  as  from  the  normalized  molecular  moment  ( Mn ),  for  example, 
the  analogous  electronic  and  charge  transforms  ( FTe  and  FTC)  and  moments  (Me  and 
Mc).  In  this  study,  each  of  these  indices  was  calculated  for  up  to  10  sampled  conformations 
of  each  of  the  normal  alkanes  as  they  were  subjected  to  a  standard  annealing 

process.  Statistical  analyses  of  the  resulting  data  in  the  individual  series  and  subsequent 
box  plots,  permitting  facile  examination  of  those  results,  indicated  that  the  respective 
transform  indices  ( FTm ,  FTe,  FTC )  are  unique,  that  is,  with  no  statistically  significantly 
overlap  across  the  series.  For  the  Mn  and  Me  indices,  the  numerical  values  for  methane 
overlapped  those  of  ethane  in  the  first  instance  and  both  ethane  and  propane  in  the 
second.  The  Mc  index  values  overlapped  in  several  instances  in  the  series.  Inasmuch  as 
the  noted  molecular  indices  are  based  only  on  parameters  of  structural  origin,  these 
results  have  profound  implications  for  the  correlation  and  estimation  of  properties 
derived  not  only  from  a  general  structure  representation,  but  also  for  those  properties 
which  may  be  dependent  on  specific  molecular  conformations.  This  includes  the  potential 
for  indices  of  molecular  flexibility  and  conformationally  dependent  atomic  electron 
densities.  ©  1998  John  Wiley  &  Sons,  Inc.  Int  J  Quant  Chem  70:  1185-1194,  1998 


Introduction 

The  integrated  molecular  transform  (FTm)  and 
normalized  molecular  moment  (Mn)  indices 
and  their  analogous  electronic  and  charge  indices 

Correspondence  to:  J.  W.  King. 


are  unitary  numerical  surrogates  of  either  opti¬ 
mized  (at  any  level)  or  nonoptimized  structures. 
The  indices  are  derived  from  considerations  based 
solely  on  structure  parameters,  that  is,  bond  dis¬ 
tances,  interatomic  distances,  atomic  number, 
atomic  weight,  and /or  quantum  mechanical  calcu¬ 
lations.  In  the  most  literal  sense,  the  indices  are  the 
result  of  mapping  down  processes  that  convert  a 
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descriptorially  multivariate  entity,  the  molecule, 
into  a  single  number,  and  with  each  index,  a 
specific  feature  of  the  molecule  may  be  empha¬ 
sized.  Inasmuch  as  the  indices  have  been  precisely 
defined  in  a  recent  publication  [1],  their  origins 
will  not  be  further  reviewed  herein. 

In  general,  the  application  of  the  various  indices 
has  been  to  correlate  the  structure  with  the  chemi¬ 
cal,  physical,  and  pharmacological  properties  with 
a  view  toward  extrapolation  or  interpolation  capa¬ 
bilities  [1-12].  However,  their  versatility  permits 
an  emphasis  on  specific  molecular  aspects  as  well, 
for  example,  only  the  structure  ( FTm  and  M„),  or, 
if  desired,  the  electronic  ( FTC  and  Mt,)  or  charge 
nature  ( FTC  and  Mc )  of  molecules  or  a  combina¬ 
tion  of  the  indices  in  a  multivariate  correlation 
equation.  Thus,  any  molecular  attribute  may  be 
incorporated  by  mathematical  representation  in 
one  of  the  indices. 

One  of  the  concepts  resulting  from  the  existence 
of  the  unitary  indices  is  that  a  measure  of  molecu¬ 
lar  similarity  is  permitted  by  a  comparison  of 
ratios  or,  perhaps,  other  mathematical  formula¬ 
tions  of  the  respective  indices  [5,  12].  But  also  of 
importance  is  the  fact  that,  with  the  exception  of 
enantiomers,  the  indices  have  been  shown  to 
uniquely  represent  conformers  [9].  In  that  study, 
different  conformations  of  ethane,  toluene,  and 
biphenyl  were  shown  to  be  uniquely  represented 
by  the  integrated  molecular  transform  ( FTnl ).  With 
such  a  result  in  hand,  it  seemed  prudent  to  at¬ 
tempt  a  more  general  demonstration  of  this  index 
capability  and  extend  it  to  the  corollary  indices 
noted  above.  That,  then,  was  the  objective  of  the 
work  reported  herein.  In  this  context,  each  of  the 
C^-Qo  alkanes  was  subjected  to  an  annealing  pro¬ 
cess  and  up  to  10  conformations  sampled.  The 
structure  indices  were  then  calculated  for  each 
conformer  and  a  statistical  comparison  of  the  re¬ 
sults  depicted  by  box  plots  of  the  data. 


Methodology  and  Results 

The  initial  alkane  structures  were  entered  into 
the  Chem3D  Pro  molecular  mechanics  program 
[13].  The  molecular  dynamics  subset  of  this  pro¬ 
gram  provided  the  annealing  process  for  each  of 
the  10  alkanes  to  give  conformational  continuums. 
Each  continuum  was  then  randomly  sampled  to 
give  nine  or  ten  conformations  which  yielded  the 
interatomic  distances  needed  for  calculation  of  the 


FTm  and  Mn  indices.  The  single-point  energies 
were  then  calculated  for  each  conformer  by  the 
GAMESS  program  (operating  in  the  MOP  AC  mode 
with  the  AMI  Hamiltonian  [14])  to  give  the  neces¬ 
sary  electron  densities  for  calculation  of  FTC  and 
Mc  and  charge  distributions  for  calculation  of  FTC 
and  Mc.  The  transform  and  moment  indices  were 
then  calculated  by  previously  described  methods*; 
these  are  shown  in  Table  I.  The  resulting  indices 
for  each  conformer  series  were  then  statistically 
examined  with  the  SCAN  program  [15]  to  give  the 
summary  data  shown  in  Table  II.  The  box  plots 
shown  in  Figures  1-6  were  generated  with  Sigma 
Plot  [16]. 

CHARACTERISTICS  01  BOX  PLOTS 

For  ease  of  interpretation  of  the  box  plots,  it 
should  be  noted  that  the  width  of  the  box  is 
arbitrary  and  has  no  meaning  in  respect  to  the 
plotted  data.  The  top  and  bottom  limits  of  the  box 
are  the  75th  and  25th  percentiles  of  the  data,  re¬ 
spectively,  while  the  data  median  is  noted  by  the 
horizontal  line  across  the  center  of  the  box.  The 
line  crossing  the  "T"  on  the  top  of  the  box  shows 
the  95%  confidence  limit  of  the  data;  the  inverted 
"T"  on  the  bottom  of  the  box  represents  the  5% 
confidence  limit. 


Discussion 

The  previous  study  of  numerical  conformer  rep¬ 
resentation  proved  that  conformers  may  be 
uniquely  represented  by  their  integrated  molecular 
transform  (FTW)  [1].  However,  the  question  can  be 
posed  as  to  whether  there  might  be  some  numeri¬ 
cal  overlap  in  a  more  regular  compound  series, 
such  as  alkanes,  differing  by  only  a  methylene 
group.  Further,  the  application  of  the  integrated 
electronic  (FTC)  and  charge  ( FTC )  indices,  the  nor¬ 
malized  molecular  moment  (M„),  and  its  analo¬ 
gous  electronic  (Mt,)  and  charge  moments  (Mc)  to 
conformational  representation,  had  not  been 
demonstrated. 

The  results  of  this  study  are  best  seen  by  a 
perusal  of  the  figures.  In  Figure  1,  the  most  un¬ 
usual  aspect  is  the  slightly  lower  displacement  of 
the  methane  conformer  group  as  compared  to  the 
other  groups.  It  is  not  difficult  to  account  for  this 
inasmuch  as  there  appears  to  be  a  linear  relation- 

*Sec  [1]  and  citations  therein. 
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TABLE  I 


Calculated  indices  for  Cj-C^  normal  alkanes  (see  text  for  an  explanation  of  column  headings). 


Carbons 

FTm 

FTe 

FTC 

Mn 

Me 

Mc 

Cl 

6.987576 

4.579256 

0.024590 

0.810334 

1.124986 

0.428260 

Cl 

6.712426 

4.390923 

0.023975 

0.935001 

1.125000 

0.392177 

Cl 

6.578948 

4.297664 

0.024565 

0.935001 

1.125000 

0.399510 

Cl 

6.458280 

4.222586 

0.023499 

0.748001 

0.874989 

0.465629 

Cl 

6.210833 

4.050942 

0.022874 

0.872668 

0.999988 

0.415565 

Cl 

6.421484 

4.197506 

0.023717 

0.623334 

1.000013 

0.498238 

Cl 

6.173242 

4.023162 

0.022598 

0.498667 

0.750000 

0.530247 

Cl 

6.513076 

4.255519 

0.024212 

0.872668 

1.125000 

0.411798 

Cl 

6.147553 

4.011489 

0.023019 

0.685668 

0.750000 

0.495021 

Cl 

6.290279 

4.104471 

0.022979 

0.748001 

0.999988 

0.424484 

C2 

41.179925 

24.746405 

0.057962 

0.798147 

0.857143 

0.336114 

C2 

41 .047766 

24.656884 

0.057628 

0.798147 

0.857143 

0.337408 

C2 

41 .096322 

24.707458 

0.057960 

0.798147 

0.857143 

0.334909 

C2 

41.196547 

24.758945 

0.057378 

0.798147 

0.857137 

0.334263 

C2 

41.291619 

24.768512 

0.057351 

0.79814 7 

0.857143 

0.340662 

C2 

40.614203 

24.365527 

0.056042 

0.764891 

0.857143 

0.339907 

C2 

40.465850 

24.285497 

0.056734 

0.798147 

0.857137 

0.340457 

C2 

41 .404900 

24.811028 

0.056462 

0.764891 

0.857137 

0.341087 

C2 

40.239286 

24.168651 

0.057318 

0.798147 

0.857143 

0.339892 

C2 

41.119917 

24.718319 

0.058000 

0.764891 

0.857143 

0.335669 

C3 

61 .730045 

35.366721 

0.073667 

1 .065844 

1.050000 

0.439550 

C3 

61 .583751 

35.281740 

0.074173 

1 .065844 

1 .050000 

0.441290 

C3 

61 .509062 

35.254482 

0.073042 

1.043166 

1.049995 

0.439138 

C3 

61 .942748 

35.432699 

0.073217 

1 .020489 

1.050000 

0.434095 

C3 

60.412583 

34.717018 

0.074872 

1 .065844 

1 .049990 

0.449778 

C3 

62.748389 

35.919357 

0.073979 

1.020489 

1.049995 

0.437397 

C3 

62.05241 1 

35.487362 

0.070601 

1 .088521 

1 .050000 

0.438612 

C3 

62.015678 

35.468971 

0.074696 

1 .020489 

1.000000 

0.443057 

C3 

61 .930000 

35.284296 

0.070664 

1.043166 

1.000005 

0.432263 

C3 

59.822879 

34.274096 

0.073421 

1 .065844 

1.050005 

0.454840 

C4 

79.241786 

44.492780 

0.092804 

1 .376382 

1.461561 

0.498671 

C4 

78.494188 

44.057310 

0.091857 

1 .376382 

1.461533 

0.499229 

C4 

79.409604 

44.548020 

0.093429 

1.359177 

1.461550 

0.498600 

C4 

77.188174 

43.300959 

0.091158 

1 .376382 

1.461533 

0.496246 

C4 

78.901902 

44.267885 

0.094077 

1.359177 

1 .423071 

0.502933 

C4 

78.251678 

43.921354 

0.090537 

1.359177 

1.461533 

0.504800 

C4 

77.297306 

43.358956 

0.088922 

1.359177 

1.461533 

0.503471 

C4 

76.735552 

43.077931 

0.088287 

1 .393587 

1.461527 

0.494185 

C4 

77.007562 

43.161377 

0.086065 

1 .376382 

1.461527 

0.486074 

C4 

77.637438 

43.435717 

0.092562 

1.35917 7 

1 .423077 

0.506593 

C5 

92.856293 

51.446007 

0.109185 

1 .593895 

1.625010 

0.558995 

C5 

92.930752 

51.405263 

0.106200 

1 .607755 

1 .625000 

0.560091 

C5 

93.120414 

51.538966 

0.106176 

1 .607755 

1 .593740 

0.568756 

C5 

92.914801 

51.512217 

0.110178 

1 .593895 

1 .656250 

0.545212 

C5 

93.569230 

51.788255 

0.107119 

1 .593895 

1 .593750 

0.566034 

C5 

91 .997341 

50.835562 

0.102806 

1.621615 

1 .656255 

0.556445 

C5 

93.675362 

51.862599 

0.105841 

1 .593895 

1 .593745 

0.556779 

C5 

90.718127 

50.260480 

0.108838 

1 .635475 

1 .656250 

0.573272 

C5 

92.203865 

51.060317 

0.104319 

1 .607755 

1 .624995 

0.547331 

C5 

90.992395 

50.251630 

0.099873 

(Continued) 

1.663195 

1 .625000 

0.585893 
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TABLE  I  _ 
(Continued). 


Carbons 

FTm 

FTe 

FTC 

Me 

Mc 

C6 

111.808612 

61.709386 

0.128761 

1 .879848 

1 .894727 

0.593127 

C6 

110.495134 

60.809936 

0.122568 

1.891452 

1.921053 

0.606570 

C6 

109.231913 

60.246225 

0.126034 

1.914660 

1 .894732 

0.598337 

C6 

108.588437 

60.239320 

0.127193 

1 .984284 

1 .947368 

0.598246 

C6 

111.569415 

61.740502 

0.118918 

1 .903056 

1.868426 

0.609694 

C6 

109.237809 

60.094957 

0.119726 

1 .926264 

1 .894732 

0.552992 

C6 

1 1 2.687464 

61.918144 

0.117628 

1 .845036 

1 .789474 

0.628347 

C6 

111.881502 

61 .679248 

0.117797 

1.891452 

1.842110 

0.630124 

C6 

109.855346 

60.412243 

0.118532 

1.879848 

1.842105 

0.583331 

C7 

127.835993 

70.024476 

0.145468 

2.215480 

2.090909 

0.612861 

C7 

127.201751 

69.660087 

0.142529 

2.185541 

2.136349 

0.625831 

C7 

127.993452 

70.097949 

0.143057 

2.155602 

2.091384 

0.641087 

C7 

129.428918 

70.853597 

0.141977 

2.135642 

2.136364 

0.646357 

C7 

127.633380 

69.857448 

0.138810 

2.165581 

2.068182 

0.622345 

C7 

125.677923 

68.741534 

0.137854 

2.155602 

2.090919 

0.610405 

C7 

122.874897 

67.052826 

0.142038 

2.155602 

2.136359 

0.618217 

C7 

46.974974 

24.615089 

0.086652 

4.141550 

4.227273 

1.208002 

C7 

127.022915 

69.493979 

0.137449 

2.145622 

2.181808 

0.650266 

C7 

127.061895 

69.649755 

0.138369 

2.175561 

2.113636 

0.646219 

C8 

149.080229 

81.497520 

0.165194 

2.363633 

2.439995 

0.651653 

C8 

149.613119 

81.806698 

0.164695 

2.407404 

2.420000 

0.662258 

C8 

149.735516 

81.845108 

0.163517 

2.389896 

2.439995 

0.660822 

C8 

148.852683 

81.309880 

0.162809 

2.398650 

2.440000 

0.674522 

C8 

148.508235 

81.056824 

0.159152 

2.389896 

2.440005 

0.673042 

C8 

147.340604 

80.509246 

0.160919 

2.407404 

2.440005 

0.649816 

C8 

146.021563 

79.811669 

0.161177 

2.424913 

2.420005 

0.636731 

C8 

143.321614 

78.244635 

0.158898 

2.389896 

2.419995 

0.615587 

C8 

142.850651 

77.994129 

0.161490 

2.381142 

2.359995 

0.631566 

C8 

145.980078 

79.584763 

0.156405 

2.424913 

2.419981 

0.627274 

C9 

171.063920 

93.322880 

0.184141 

2.736676 

2.625000 

0.644124 

C9 

171.351757 

93.438945 

0.183369 

2.728879 

2.660710 

0.650103 

C9 

169.148191 

92.130596 

0.181065 

2.760066 

2.785714 

0.655126 

C9 

170.326638 

92.865610 

0.181896 

2.697692 

2.678581 

0.675747 

C9 

170.302665 

92.800450 

0.177425 

2.721082 

2.714291 

0.686287 

C9 

170.352554 

92.947907 

0.181377 

2.697692 

2.696424 

0.701479 

C9 

170.100057 

92.730363 

0.180439 

2.752269 

2.696424 

0.672828 

C9 

170.100057 

92.730363 

0.180439 

2.752269 

2.696424 

0.672828 

C9 

165.154138 

89.781148 

0.171287 

2.619724 

2.607138 

0.624933 

C9 

164.730054 

89.728341 

0.177340 

2.565146 

2.535719 

0.651846 

CIO 

192.191653 

104.606044 

0.203915 

3.001026 

2.967747 

0.670474 

CIO 

192.191653 

104.606044 

0.203915 

3.001026 

2.967747 

0.670474 

CIO 

192.261743 

104.551337 

0.200315 

3.022110 

3.032243 

0.68621 5 

CIO 

189.540819 

103.000049 

0.198518 

3.029138 

2.983866 

0.705895 

CIO 

189.303787 

102.864455 

0.196671 

3.029138 

2.983866 

0.711553 

CIO 

191.010092 

103.860264 

0.197011 

2.979941 

2.935489 

0.705731 

CIO 

192.270043 

104.492034 

0.199814 

3.015082 

2.967742 

0.666907 

CIO 

187.802697 

101.936861 

0.194529 

3.029138 

3.032258 

0.646869 

CIO 

188.894104 

102.551026 

0.195955 

3.008054 

2.983876 

0.633660 

CIO 

186.937441 

101.260288 

0.189922 

2.902632 

2.854830 

0.672898 
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TABLE  II _ 

Statistical  aspects  of  the  calculated  molecular  indices  of  the  C.,-C10  normal  alkanes. 


PARAMETRIC  TRANSFORM  /MOMENT  INDICES  IN  MD  OF  n-ALKANES 
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FIGURE  1.  FTm  versus  carbon  number. 


ship  between  the  other  groups  consistent  with 
the  general  characteristic  of  larger  FTm  values  as 
molecular  weight  increases.  For  methane,  it  may 
be  that  there  is  not  enough  structural  variation  in 
its  conformers  in  respect  to  those  of  the  C2-C10 
alkanes.  But,  more  importantly,  there  is  no  overlap 
of  index  values  in  the  series.  The  same  comments 
may  be  noted  for  the  FTe  indices  plotted  in  Figure 
2.  In  Figure  3,  the  displacement  from  strict  linear¬ 


ity  of  the  FTC  index  of  the  methane  conformers 
with  respect  to  the  remainder  of  the  series  appears 
to  be  less  than  for  the  previously  noted  indices; 
this  may  be  due  in  part  to  the  somewhat  artificial 
consideration  of  the  alkanes  as  charged  species. 
Again,  there  are  no  overlaps  of  index  values. 

Figure  4  is  a  plot  of  Mn  for  the  series.  In  this 
case,  the  behavior  of  methane  in  respect  to  the 
other  hydrocarbons  is  unusual.  Inasmuch  as  there 


FIGURE  2.  FTe  versus  carbon  number. 
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FIGURE  3.  FTC  versus  carbon  number. 


can  be  only  limited  structural  variation  in  this 
molecule  as  it  is  subjected  to  an  annealing  process, 
an  explanation  of  the  wide  index  range,  as  re¬ 
flected  by  the  vertical  dimensions  of  the  box,  re¬ 
mains  obscure.  This  is  true  also  for  the  Mc  index 
plotted  in  Figure  5.  Figure  6  is  the  plot  of  Mc  for 
the  series  and  is  the  most  unusual  of  all,  with 
methane  again  having  the  most  variant  behavior. 
But  the  appearance  of  the  plots  for  the  C5-C10 
alkanes  also  defies  explanation  other  than,  as  noted 


for  the  FTC  data,  to  consider  that  a  general  repre¬ 
sentation  of  the  alkanes  as  charged  species  is  not 
appropriate. 

Perhaps  the  most  interesting  aspect  of  the  box 
plots  are  that,  for  each  alkane,  they  give  a  visual 
indication  of  the  variation  in  the  index  range.  For 
instance,  in  Figure  1,  one  could  surmise  that  the 
C4,  C6,  C8,  and  C10  alkanes,  by  virtue  of  their 
greater  index  ranges  as  compared  to  the  rest  of  the 
series,  are  more  flexible  than  are  the  compounds 


FIGURE  4.  Mn  versus  carbon  number. 
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FIGURE  5.  Me  versus  carbon  number. 


with  an  odd  number  of  carbon  atoms.  In  Figure  4, 
the  reverse  appears  to  be  the  case,  that  is,  com¬ 
pounds  with  an  odd  number  of  carbon  atoms 
generally  appear  to  have  the  wider  index  range. 
While  this  introduces  a  degree  of  dichotomy,  inas¬ 
much  as  these  two  indices  ( FTm  and  Mn,  respec¬ 
tively)  are  really  structure  indices  in  a  strict  sense, 
such  generalizations  may  really  be  indicative  of 
experimental  behavior,  with  the  FTm  index,  be¬ 
cause  of  the  nature  of  its  derivation,  being  the 


most  reliable.  The  FTe  index  of  Figure  2  appears  to 
follow  the  pattern  of  Figure  1  and  thus  would  be 
confirmatory.  The  Me  index  of  Figure  5  is  less 
consistent  in  its  pattern  than  its  Mn  counterpart, 
and  as  it  reflects  the  electronic  nature  of  the 
molecules,  probably  no  pattern  should  be  pre¬ 
sumed.  For  the  respective  charge  indices  shown  in 
Figures  3  (FTC)  and  6  (Mc),  no  clear  pattern 
emerges  except  for  a  tendency  toward  a  nonlinear 
relationship  between  the  compounds  and  this  again 


FIGURE  6.  Mc  versus  cargon  number. 
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suggests  that  these  alkanes  may  not  be  well  repre¬ 
sented  as  charged  species  or  that  the  energy  rela¬ 
tionships  in  the  series  may  not  be  adequately  re¬ 
flected  in  the  present  calculations.  One  must  also 
consider  that  the  transform  indices  distance  pa¬ 
rameters  are  interatomic  while  the  moment  dis¬ 
tances  are  from  each  atom  to  the  geometric  center 
of  the  molecule.  The  molecular  representational 
quality  of  this  difference  in  conformer  depiction 
remains  to  be  established. 


Conclusions 

This  study  has  shown  that  conformers  of  each 
normal  alkane  in  the  C  i  C10  series  may  be 
uniquely  numerically  represented  by  the  inte¬ 
grated  molecular  transform  (FTm).  Similarly,  the 
integrated  electronic  and  charge  transforms  ( FTC 
and  FTC,  respectively)  uniquely  interpret  those  as¬ 
pects  across  the  series,  that  is,  there  are  no  numeri¬ 
cal  overlaps  of  index  values,  although  the  specific 
methane  conformer  values  in  each  case  prevent  a 
strict  linear  relationship  in  the  series.  For  the  nor¬ 
malized  molecular  moment  (M;f)  and  the  normal¬ 
ized  electronic  moment  (M(1),  methane  is  an  outlier 
whose  numerical  values  overlap  those  of  ethane  in 
the  first  instance  and  both  ethane  and  propane  in 
the  second.  In  the  case  of  the  normalized  charge 
moment  (Mr),  the  extreme  range  of  values  for  each 
alkane  results  in  several  overlaps  between  the  re¬ 
spective  compounds,  suggesting  that  this  particu¬ 
lar  index  is  not  suitable  for  this  series. 

The  box  plots  of  the  data  in  this  study  visually 
articulate  the  respective  index  values  for  the  com¬ 
pounds.  The  variation  in  the  range  of  such  data  for 
each  alkane  may  be  an  indicator  of  conformational 
flexibility  in  the  case  of  the  FTn,  and  Mn  indices. 
Similar  considerations  for  the  electronic  indices 
( FTC  and  Mr)  may  give  an  indication  of  the  depen¬ 
dence  of  atomic  electron  density  on  structural  vari¬ 
ation.  Work  continues  to  effect  a  numerical  defini¬ 
tion  of  these  considerations. 
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ABSTRACT:  In  order  to  determine  the  structural  requirements  that  are  important  for 
GABAb  binding  affinity,  a  quantum-chemical-based  conformational  study  has  been 
performed,  followed  by  a  similarity  analysis  which  includes  12  GABAb  analogs.  Due 
to  the  flexibility  of  the  structures,  a  semigrid  GABAb  analog  [2RS-(5,5-dimethyl) 
morpholinyl-acetic  acid]  has  been  used  as  a  template  for  the  amonium  moiety  in  order  to 
help  to  identify  the  active  conformation.  Both  in  vacuo ,  and  solvent-simulated  calculations, 
for  the  physiological  media  modeled  as  water  molecules,  have  been  compared,  for  this 
analog,  at  ab  initio  (G94,  6-31  +  G(d,p))  and  semiempirical  (PM3)  levels,  respectively.  On 
the  basis  of  this  comparison,  the  results  of  in  vacuo  PM3  calculations  have  been  chosen 
for  the  similarity  analysis.  We  have  included,  in  the  calculations,  a  group  of  molecules 
heterogeneous  enough  to  become  representative  of  the  different  families  that  can  bind  to 
the  GABAb  receptor  site.  Following  their  comparison  we  report  the  leading  characteristics 
that  can  be  related  to  their  binding  capability  and  define  a  pharmacophoric  pattern  for 
GABAb  analogs.  The  latter  is  compared  with  the  one  previously  found  for  the  binding 
affinity  at  the  GABAa  receptor  site.  ©  1998  fohn  Wiley  &  Sons,  Inc.  Int  J  Quant  Chem  70: 
1195-1208,  1998 
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Introduction 

Inhibition  and  excitation  in  the  central  nervous 
system  (CNS)  are  mainly  controlled  by  either 
y-aminobutyric  acid  (GABA)  or  L-glutamate  neu¬ 
rotransmitters.  GABA,  like  other  neurotransmit¬ 
ters,  including  L-glutamate,  serotonin,  and  acetyl¬ 
choline,  activates  both  ionotropic  (GABAa)  and 
metabotropic  (GABAb)  receptors  [1-4].  Whereas 
the  GABAa  receptor  was  cloned  a  decade  ago, 
success  in  cloning  the  GABAb  receptor  is  more 
recent  (1997)  [1,  5].  The  ionotropic  GABAa  recep¬ 
tors  are  ligand-gated  ion  channels  that  produce 
fast  synaptic  transmission.  Metabotropic  GABAb 
receptors,  on  the  other  hand,  couple  to  G  proteins 
(guanine-nucleotide-binding  proteins)  and  pro¬ 
duce  several  divergent  effects  through  intracellular 
effector  systems:  opening  of  adjacent  potassium 
channels,  closure  of  voltage-gated  calcium  chan¬ 
nels,  and  inhibition  of  the  enzyme  adenylyl  cyclase 
[2,  5].  The  effects  of  stimulating  GABAb  receptors 
are,  thus,  slower  and  lead  to  a  more  prolonged 
postsynaptic  inhibition,  associated  with  some  types 
of  learning  and  memory. 

Baclofen,  a  GABAb  agonist,  was  introduced  in 
the  market  in  1972  and  mainly  used  for  its  thera¬ 
peutic  potential  in  several  respiratory  diseases, 
such  as  asthma  [2,  3].  Since  then,  analogs  of  ba¬ 
clofen,  saturated  and  unsaturated,  have  been  syn¬ 
thesized  and  tested  for  GABAb  receptor  affinity 
[6-9].  The  phosphonic  and  sulfonic  analogs 
(phaclofen  and  saclofen,  respectively),  as  well  as 
the  2-hydroxy  derivative  of  the  latter  [10,  11],  have 
been  shown  to  be  antagonists  at  the  GABAb  recep¬ 
tor  and  used  as  neuroprotec tive  drugs  [2]  to  treat 
spasticity,  absence  epilepsy,  anxiety,  depression, 
and  cognition  deficits,  as  well  as  the  respiratory 
depression  caused  by  excessive  doses  of  GABAb 
agonists  [1],  Antidepressant  properties  have  also 
became  apparent  for  GABAb  agonists  [1,  3,  4].  In 
this  framework,  the  conformational  analysis  of 
several  baclofen  analogs  has  demonstrated  the  im¬ 
portance  of  lipophilic  substitutions  in  the  het- 
eroaromatic  ring  to  increase  the  binding  affinity 
[6,  12]. 

The  clinical  importance  of  GABAb  analogs  has 
stimulated  the  research  in  this  field.  A  new  class  of 
potent  phosphinic  GABAb  antagonists  has  been 
recently  described  by  Froestl  and  co-workers  [3, 
13].  Lipophilic  groups  bound  to  both  the  phospho¬ 
rous  and  nitrogen  atoms  also  appear  as  necessary 


for  the  GABAb  binding  affinity  to  become  signifi¬ 
cant.  Even  later;  morpholine-2-acetic  acid  deriva¬ 
tives  have  been  reported  as  GABAb  antagonists 
[14].  Their  affinity  also  increases  after  lipophilic 
substitution  in  position  5  of  the  morpholine  ring. 

Several  conformational  analyses  have  succeeded 
in  identifying  structural  requirements  for  GABAb 
binding  affinity  [6,  12,  15].  They  have  been  based, 
however,  on  the  comparison  of  analogs  of  the 
same  class,  and  the  conclusions  derived  from  them 
are  only  valid  for  the  congeneric  family  to  which 
they  belong.  With  the  aim  of  elucidating,  in  a  less 
restrictive  manner,  the  structural  requirements  in¬ 
volved  in  accessing  the  GABAb  receptor,  we  have 
performed  a  conformational  study,  followed  by  a 
similarity  analysis  that  is  mainly  based  on  the 
comparison  of  structural  descriptors,  including,  in 
our  research,  analogs  that  belong  to  different  fami¬ 
lies.  Due  to  the  consideration  of  dissimilar  struc¬ 
tures  in  the  comparative  analysis,  the  number  of 
requirements  for  binding  affinity  derived  from  it  is 
smaller,  but  of  more  general  applicability  for  the 
evaluation  of  the  binding  capability. 

Although  cloning  has  given  some  information 
on  the  primary  structure  of  the  protein  receptor, 
more  detailed  information  about  the  GABAb  bind¬ 
ing  site  is  lacking.  Among  the  relevant  missing 
information,  mainly  relating  to  the  secondary,  ter¬ 
tiary,  and  quaternary  protein  structures,  it  should 
be  noted  that  the  nature  of  the  environment  at  the 
receptor  site  is  not  known.  However,  the  binding 
of  a  molecule  to  a  proteic  extracellular  domain  of 
the  GABAb  receptor  [1,  5],  together  with  the  evi¬ 
dence  that  lipophilic  substitutions  improve  the 
binding  capability  [6,  12,  14,  15],  discourages  the 
assumption  that  a  polar  extracellular  domain  is 
involved.  From  a  theoretical  standpoint  this  con¬ 
sideration  is  relevant,  mainly  when  dealing  with 
zwitterionic  structures  as  GABA  analogs.  Extracel¬ 
lular  hydrophilic  interactions  imply  an  environ¬ 
ment  defined  by  the  physiological  media,  of  large 
dielectric  constant,  which  is  accurately  modeled  by 
water  as  a  solvent.  Intracellular,  as  well  as  extra¬ 
cellular  interactions  involving  a  proteic  media, 
imply  a  nonpolar  environment  that  can  be  ap¬ 
proached  by  calculations  in  vacuo.  In  this  frame¬ 
work,  the  comparison  of  the  results  of  theoretical 
calculations  obtained  in  both  conditions  can  help 
in  discerning  the  characteristics  of  the  environ¬ 
ment  in  which  the  interaction  occurs.  The  compu¬ 
tational  scheme  for  the  treatment  of  GABAb 
analogs  is  further  complicated  by  the  flexibility  of 
the  structures,  that  define  several  minima,  very 
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close  in  energy,  in  the  potential  hypersurface,  and 
also  by  the  fact  that  the  active  conformations, 
associated  with  the  conformations  at  the  binding 
site,  are  not  necessarily  those  of  lower  energy, 
although,  in  general,  close  to  them.  These  compli¬ 
cations  are  generally  overcome  by  means  of  the 
consideration  of  rigid  analogs  [16],  whose  struc¬ 
tural  characteristics  help  in  discerning  the  require¬ 
ments  for  the  molecules  to  be  active.  However,  in 
contrast  to  the  case  of  GABAa  [17],  no  rigid  analog 
has  been  found  for  the  GABAb  receptor  site.  We 
have  chosen,  therefore,  the  partially  rigid  2RS- 
(5,5-dimethyl)morpholinyl-acetic  acid  as  our  tem¬ 
plate,  and  have  mainly  centered  our  research  on  a 
conformational  study  that  considers,  for  the  other 
structures,  the  requirements  imposed  by  it.  More¬ 
over,  on  the  basis  of  the  previous  discussion,  we 
have  decided  to  model  our  system  in  vacuo ,  al¬ 
though  polar  and  proteic  environments  have  been 
compared  for  the  semirigid  analog. 

The  research  presented  in  this  article,  which 
explores  the  conformational  space  of  the  GABAb 
analogs  when  interacting  at  the  receptor  site,  gives 
insight,  after  the  similarity  analysis,  into  the  con¬ 
formational  preferences  of  the  structures.  We  are 
presently  searching  for  more  detailed  information 
as  part  of  a  more  ambitious  project,  but  this 
achievement  relies,  for  the  moment,  on  the  synthe¬ 
sis  of  analogs  with  a  higher  degree  of  rigidity  or 
the  further  knowledge  of  the  receptor  site,  which 
would  allow  the  application  of  other  modeling 
resources. 


Details  of  the  Calculation  Procedure 

We  have  included  in  our  analysis  the  set  of 
compounds  listed  in  Table  I  and  shown  in  Figure 
1,  whose  elements  are  not  restricted  to  a  unique 
congeneric  family.  As  the  first  step  of  the  research 
a  thorough  conformational  study  was  performed, 
which  was  followed  by  a  similarity  analysis  where 
stable  conformations  were  compared.  These  con¬ 
formations  are  partially  defined  by  the  structure  of 
the  morpholine  acetic  acid  derivative,  chosen  as  a 
template. 

Because  there  is  no  strong  evidence  that  sup¬ 
ports  the  modeling  of  a  polar  environment  sur¬ 
rounding  the  interaction  site,  we  have  based  our 
study  on  the  results  derived  from  calculations  in 
vacuo.  However,  as  the  influence  of  a  polar  media 
has  not  been  completely  rejected,  solvent-simu- 


TABLE  I _ 

Binding  affinity  of  the  GABAb  analogs.3 


GABAb 

Binding  affinity 
(brain  membrane) 


Agonists 

bO 

GABA 

60  nM 

bl 

beta-R-hydroxy-GABA 

1.3  ^M 

b2 

R(-)-Baclofen 

60  nM 

b3 

3-aminopropylphosphinic 

acid 

1  -5  nM 

b4 

3-aminopropyl  (methyl)- 

phosphinic  acid 

0.3  nM 

Antagonists 

b5 

3-aminopropanesulphonic 

acid 

10  fiM 

b6 

R-Phacofen 

100 

b7 

R-Saclofen 

100  /xM 

b8 

4-amino-3-(5-methoxy- 

benzo-[h] 

furan-2-yl)butiric  acid 

5.5  /xM 

b9 

CGP35348 

(ileum) 

blO 

CGP36742 

100  juM 

bl  1 

CGP55845 

(vas  deferens) 
35  /x M 

b12 

2RS-[(5,5-dimethyl)morpho- 

linyl]- 

7  nM 

b13 

acetic  acid 

y-Amino-(7-methyl-benzo- 

furan)- 

3  +  1  /xM 

butiric  acid 

5.4  ix M 

abn  =  short  symbols  used  to  refer  to  them. 


lated  calculations  are  presently  being  done,  for  the 
solvent  modeled  as  a  continuum  within  an  On- 
sager  approach  [18],  in  the  framework  of  ab  initio 
G94/6-31  +  G(d,p)  calculations  [19].  The  results  of 
both  approaches  are  presented  in  this  article,  in  a 
comparative  manner,  for  the  case  of  the  morpho¬ 
line  acetic  acid  derivative,  as  a  way  of  showing 
that  solvent  simulation  does  not  become  relevant 
when  the  conformations  are  partially  defined  by 
the  requirements  imposed  by  our  template. 

In  the  present  research,  the  first  step  of  the 
calculation  is  associated  with  the  conformational 
search  of  the  fully  relaxed  isolated  molecules. 
Dealing  with  very  flexible  zwitterionic  molecules, 
in  vacuo  geometry  optimization  leads  to  severely 
curved  structures,  stabilized  by  proton  transfer 
from  the  positive  to  the  negative  end.  In  order  to 
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b3 

FIGURE  1.  GABAb  analogs  included  in  the  comparative  analysis:  bl 2,  rigid  analog;  bl 3,  benzofuran  derivative  of 
baclofen  whose  X-ray  structural  data  have  been  considered  in  the  text.  Torsional  angles  are  shown  for  bl 2: 
r1  =  NC1C2C3,  t 2  =  C^CgC^,  r3  =  C2C3C405,  t4  =  C2C3C406,  r5  =  C.|C2C3Y  (Y  =  third  substituent  of  C4,  not 
shown),  t6  =  NC1C207,  t7  =  C^C2G7CQ.  Red,  O;  blue,  N;  light  blue,  C;  green,  P;  yellow,  S. 


avoid  this  effect,  which  is  known  to  be  unreal  from 
the  consideration  of  the  requirements  imposed  by 
the  morpholine  acetic  acid  derivative,  the  struc¬ 
tures  have  been  partially  frozen  to  the  torsional 
angles  defined  by  the  rigid  moiety  of  the  semirigid 
analog.  Thus,  in  order  to  perform  a  complete  search 
for  the  accessible  conformational  space  in  the  inter¬ 
action  site,  the  unfrozen  torsional  angles  have  been 
varied  in  10°  steps,  from  0°  to  360°,  with  complete 
relaxation  of  the  other  variables  at  each  fixed  ge¬ 
ometry.  In  this  way,  starting  with  the  morpholine 
acetic  acid  derivative  (Fig.  1),  for  which  the  r} 
value  is  well  defined,  the  values  of  t2  associated 
with  minimum  energy  were  determined.  For  these 


Tj,  r2  values  imposed  on  the  other  structures,  the 
other  torsional  angles  have  been  calculated  by 
means  of  a  complete  search  over  the  conforma¬ 
tional  space.  On  the  basis  of  our  previous  experi¬ 
ence,  derived  from  the  comparison  of  semiempiri- 
cal  (PM3,  AMI,  MNDO)  [20]  and  ab  initio  (G94/6- 
31  +  G(d,p))  [19]  calculations  for  the  conforma¬ 
tional  analysis  of  GABAa  derivatives,  which  has 
shown  that  both  the  semiempirical  and  ab  initio 
methodologies  lead  to  similar  results  when  per¬ 
formed  in  vacuo  [17],  we  have  chosen  PM3  for  the 
conformational  analysis.  The  lower  computational 
requirements  associated  with  this  methodology  al¬ 
low  a  more  detailed  analysis  of  the  conformational 
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FIGURE  1.  (Continued) 


space.  Ab  initio  calculations  are  characterized,  for 
these  flexible  systems,  by  a  slow  convergence  to 
the  structure  of  minimum  energy,  which  is  located 
in  a  very  flat  region  of  the  potential  hypersurface. 

Different  environmental  conditions  have  been 
modeled  for  the  conformational  analysis  of  the 
morpholine  acetic  acid  derivative.  The  physiologi¬ 
cal  media,  which  is  the  surrounding  environment 
for  hydrophilic  interactions,  has  been  simulated  by 
water  (through  its  dielectric  constant)  in  the  frame¬ 
work  of  an  Onsager  approach  [18].  The  active 
conformation  for  lipophilic  interactions  has  been 
approached,  on  the  other  hand,  by  calculations  in 
vacuo.  Whereas  ab  initio  (G94/6-31  +  G(d,p))  cal¬ 
culations  [18]  have  been  performed  in  the  first 
case,  semiempirical  PM3  calculations  [20]  have 
been  done  in  the  second.  Far  from  being  arbitrary. 


this  decision  is  twofold.  On  one  side,  we  have 
found  that  solvent-simulated  PM3  calculations  [21] 
do  not  lead  to  reliable  results  (in  comparison  with 
those  derived  from  solvent-simulated  ab  initio 
ones).  On  the  other  side,  we  are  interested  in 
analyzing  the  confidence  of  the  semiempirical  PM3 
calculations  that  will  be  used  throughout  this  re¬ 
search.  They  have  been  chosen  on  the  basis  of  the 
knowledge  of  the  computational  cost  that  would 
demand  G94/6-31  +  G(d,p)  calculations  for  the 
evaluation  of  the  torsional  barriers  around  the 
flexible  bonds,  and  on  the  similarity  of  the  results 
from  both  approaches  when  used  for  the  analysis 
of  GABAa  analogs  [17]. 

The  similarity  analysis  that  followed  the  confor¬ 
mational  search  has  been  mainly  based  on  the 
comparison  of  the  structural  parameters  and  on 
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FIGURE  1.  (Continued) 


the  use  of  computer  graphic  molecular  superimpo¬ 
sition  in  order  to  compare  the  structures  as  a 
whole. 


Results  and  Discussion 

STRUCTURAL  CHARACTERISTICS  OF  THE 

MORPHOLINE  ACETIC  ACID  DERIVATIVE,  B12 

Regardless  the  calculation  methodology  and  the 
simulated  environment,  the  conformational  analy¬ 
sis  of  the  semirigid  analog  indicates  the  stabiliza¬ 
tion  of  two  minima,  mainly  defined  by  the  value 
of  the  C1C2C3C4  (t2)  torsional  angle. 

The  calculated  torsional  angles  derived  from 
both  methodologies  (Fig.  1,  b!2)  are  given  in  Table 


II.  As  previously  mentioned,  the  value  of  r2  be¬ 
comes  the  most  relevant  calculated  data  to  discern 
the  conformation  of  the  GABA  chain  in  the  bind¬ 
ing  site,  as  ru  r6,  and  t7,  are  defined  by  rigidiza- 
tion.  r3  and  r4,  on  the  other  hand,  correspond  to 
very  flexible  angles.  Interatomic  distances  and  pla¬ 
nar  angles  are  comparatively  shown  in  Figure  2. 

From  the  conformational  analysis  of  the  mor- 
pholin  acetic  acid  derivative  two  results  can  be 
inferred: 

■  The  conformation  of  the  GABA  chain  in  the 
GABAb  analogs  is  defined  by  values  of  the 
torsional  angles  close  to  r ^  =  170°  and  r2  — 
60°/  -  60°.  The  energy  involved  in  the  con¬ 
version  between  the  most  stable  conformers 
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b13 

FIGURE  1.  (Continued) 

A  Hj  (Table  III)  is  not  large  enough  to  disre¬ 
gard  intermediate  values  of  r2  for  the  defini¬ 
tion  of  the  active  conformation. 

■  The  close  agreement  between  the  structural 
parameters  calculated  by  both  methodolo¬ 
gies  demonstrate  that  solvent  simulation 
does  not  become  relevant  for  the  analysis  of 
the  conformational  space  of  the  other  deriva¬ 
tives  of  Figure  1  when,  according  to  our 
interest  in  the  active  conformation,  the  struc¬ 
tures  are  frozen  to  the  requirements  imposed 
by  the  semirigid  analog.  For  this  conforma¬ 
tion,  which  is  straight  in  the  amonium  side, 
the  interaction  between  the  charges  in  oppo¬ 
site  ends  followed  by  proton  migration  is 
precluded.  The  straight  conformation  of  the 
semirigid  analog  reopens  the  question  of 
whether  a  polar  environment  might  be  sur¬ 
rounding  the  interaction  site  and  shows  how 
the  results  of  the  calculations  can  help  to 


TABLE  II _ 

PM3  calculated  torsional  angles  of  the  GABAb 
analogs.3 


Ti 

T2 

T3 

U 

^5  ^6 

T7 

bl2 

178 

60 

158 

-22 

53 

-60 

171 

74 

167 

-14 

52 

- 62 

159 

-59 

-144 

36 

51 

59 

-177 

-54 

-160 

24 

57 

-66 

BO 

170 

60 

150 

-30 

170 

-60 

-146 

35 

B1 

170 

60 

152 

-28 

170 

-60 

-129 

49 

B2 

170 

60 

157 

-23 

48 

54 

170 

-60 

-158 

23 

45 

58 

170 

60 

-30 

96 

—  143* 

B3 

170 

60 

-142 

-16 

100* 

170 

-60 

20 

146 

-93* 

170 

-60 

-80 

45 

160* 

170 

60 

-151 

-22 

90** 

B4 

170 

60 

-36 

91 

-150** 

170 

-60 

-92 

37 

150** 

170 

-60 

22 

151 

-90** 

B5 

170 

60 

-135 

-18 

100 

170 

-60 

170 

-65 

50 

170 

60 

0.0 

-30 

80+  47 

58 

B6 

170 

-60 

—  116 

19 

130+  44 

54 

170 

-60 

-175 

47 

-60+  36 

52 

B7 

170 

60 

-160 

-42 

75  56 

60 

170 

-60 

157 

-79 

37  46 

95 

170 

60 

156 

-25 

46 

97 

b8 

170 

-60 

-150 

30 

43 

98 

170 

60 

-168 

-34 

-80  +  + 

B9 

170 

60 

165 

-57 

50  +  + 

170 

-60 

-124 

11 

120  +  + 

170 

-60 

40 

175 

-70+  + 

170 

60 

-151 

-20 

90+  + 

170 

60 

102 

-27 

-140  +  + 

blO 

170 

-60 

-92 

37 

150+  + 

170 

-60 

21 

151 

-90+  + 

170 

60 

-30 

-99 

-143+  + 

bll 

170 

-60 

31 

153 

-89+  + 

170 

-60 

97 

24 

143  +  + 

aln  all  the  cases  but  b12,  ^  and  r2  are  kept  fixed  to  170° 
and  60°/  -  60°,  the  latter  being  close  to  the  one  that  result 
from  the  optimization.  Ab  initio  results  for  b12  are  given  in 
italics.  Atoms  considered  in  the  definition  of  r5:  (*);  H;  (**); 
primary  C;  (+);  OH;  (+  +  );  secondary  C.  Results  for  more 
than  one  minimum  are  given. 
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FIGURE  2.  Comparison  of  the  structural  descriptors 
(bond  distances  and  planar  angles)  that  result  from  PM3 
and  solvent-simulated  ab  initio  calculations  (italics)  for 
b12.  Bond  distances  are  indicated  in  the  conformation 
associated  with  t2  =  60°,  planar  angles  in  the 
conformation  defined  by  r2  =  -60°.  Torsional  angles 
are  compared  in  Table  II. 

understand  the  characteristics  of  the  interac¬ 
tion  in  the  binding  site. 

ANALYSIS  OF  THF  “SEMIRIGID" 

BACLOFEN  ANALOGS 

As  we  are  interested  in  the  active  conformation 
of  the  GABAb  analogs,  which  is  partially  defined 
by  M2,  we  have  frozen  the  value  of  r3  to  170°, 
which  is  an  intermediate  value  between  those  as¬ 
sociated  with  the  two  minima  calculated  for  bl2. 
In  order  to  perform  the  conformational  analysis  of 
the  baclofen  analogs,  including  saclofen  and  pha- 
clofen,  we  have  worked  on  the  other  structural 
parameters  scanning  the  conformational  space  by 


means  of  a  360°  rotation  of  r2,  in  10°  steps,  with 
complete  relaxation  of  the  other  parameters. 

In  agreement  with  the  results  derived  from  the 
study  of  bl2  two  minima  were  found,  defined 
by  values  of  r2  close  to  +60/-  60.  The  energy 
involved  in  the  mutual  interconversion  between 
both  conformers  is  smaller  than  the  one  calculated 
for  bl2. 

It  should  be  mentioned  that,  when  bl2  is  ana¬ 
lyzed  by  means  of  a  complete  scanning  of  the 
active  space  through  the  rotation  of  r2  in  360°,  a 
third  minimum  develops  at  -120°.  However,  this 
minimum  implies  a  significant  distortion  of  the 
morpholine  ring.  This  fact  supports  the  conclusion 
that  values  close  to  either  60°  or  -60°  in  r2, 
together  with  170°  in  r,,  will  define  the  conforma¬ 
tion  of  the  GABA  chain  in  the  binding  site. 

For  the  analysis  of  the  other  torsional  angles  a 
similar  procedure  has  been  followed.  Keeping  r3 
fixed  to  170°,  r2  has  been  kept  to  either  60°  or 
-60°,  a  value  that  is  close  enough  to  the  one  that 
results  from  the  previous  optimization.  For  the 
semirigid  geometries  thus  defined,  the  conforma¬ 
tional  space  has  been  again  scanned  by  means  of 
a  360°  rotation  of  t3  in  10°  steps  with  complete 
relaxation  of  the  other  parameters.  This  rotation 
implies  the  simultaneous  modification  of  r3  and 
r4.  The  resulting  conformations,  described  by  the 
torsional  angles,  are  given  in  Table  II.  The  opti¬ 
mized  values  of  r3,  r4,  and  r5  (Table  II)  are  diffi¬ 
cult  to  compare,  as  they  involve  different  groups, 
with  either  two  or  three  atoms  bonded  to  carbon, 
phosphorus,  or  sulfur  atoms.  It  gives,  however, 
good  agreement  for  the  carboxylatc  containing 
molecules.  In  relation  to  the  baclofen  analogs,  r6 
and  r7  have  been  also  considered.  The  most  stable 
conformation  corresponds  to  r7  =  60°.  The  energy 
difference  between  this  conformation  and  the  one 
imposed  by  the  semirigid  analog,  defined  by  r7  = 
-60°  (Table  II)  amounts  to  4  keal/mol,  with  an 
associated  rotational  barrier  of  6.0  kcal/mol.  The 
latter  is  in  close  agreement  with  the  —65.3  value 
determined  by  X-ray  diffraction  analysis  for  the 
7-methyl  benzofuran  analog  of  baclofen  (bl3.  Fig. 
1)  [12]. 

COMPARATIVE  ANALYSIS  INCLUDING  ALL  THE 

COMPOUNDS  OF  THE  SET 

The  previously  described  conformational  analy¬ 
sis,  based  on  the  partial  rigidization  of  the 
molecules  to  the  parameters  imposed  by  bl2,  has 
been  extended  to  the  other  molecules  of  the  series 
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(bO,  bl,  b3,  b4,  b5,  b9,  blO,  bll).  Data  reported  in 
Table  II  demonstrate  that  the  previous  discussion 
is  not  restricted  to  the  baclofen  analogs,  but  also 
applies  to  the  other  elements  of  the  set. 

The  first  step  of  the  optimization,  for  t1  fixed  to 
170°,  results  in  two  minima,  defined  by  r2  values 
close  to  60  and  -60,  respectively.  The  energy 


difference  between  them  (AE,  Table  III)  does  not 
allow  discernment  among  both  possibilities  for  the 
definition  of  the  characteristics  of  the  active  con¬ 
formation.  In  relation  to  the  torsional  barriers,  the 
largest  value  among  the  flexible  derivatives  corre¬ 
sponds  to  bO,  where  the  height  is  associated  to  the 
simultaneous  rotation  about  the  CC  bond  that 


TABLE  III _ 

Distance  [A]  from  the  positive  center  to  each  of  the  atoms  that  define  the  negative  center.3 


T2 

d(N —  O) 

O 

i 

z 

~o 

d(N  —  O) 

d(N  —  X) 

d 

A  E 

(kcal /mol) 

A  Hi 

(kcal  /  mol) 

AH  2 

(kcal  /  mol) 

b12 

60 

5.58 

3.83 

4.35 

4.70 

0.29 

8 

-60 

5.48 

3.75 

4.27 

4.61 

0.00 

bO 

60 

5.55 

3.81 

4.32 

4.68 

1.83 

12 

9 

-60 

5.51 

3.77 

4.28 

4.64 

0.00 

9 

bl 

60 

5.59 

3.92 

4.36 

4.75 

0.00 

12 

3 

-60 

5.35 

3.76 

4.21 

4.55 

3.71 

3 

b2 

60 

5.56 

3.80 

4.36 

4.68 

1.67 

8 

9 

-60 

5.48 

3.78 

4.31 

4.63 

0.00 

9 

b3 

60 

5.61 

3.83 

4.61 

4.72 

3.85 

4 

8 

60 

5.51 

3.80 

4.59 

4.65 

3.58 

-60 

5.50 

3.60 

4.49 

4.55 

0.77 

8 

-60 

5.41 

3.74 

4.46 

4.57 

0.00 

b4 

60 

5.68 

3.74 

4.56 

4.71 

0.00 

8 

8 

60 

5.54 

3.79 

4.55 

4.66 

0.03 

-60 

5.34 

3.80 

4.55 

4.57 

0.03 

8 

-60 

5.35 

3.74 

4.56 

4.54 

0.01 

b5 

60 

5.65 

3.90 

5.47 

4.67 

4.77 

2.43 

6 

9 

-60 

5.26 

3.86 

4.82 

4.53 

4.56 

0.00 

9 

b6 

60 

5.56 

3.80 

4.62 

4.68 

0.00 

7 

11 

-60 

5.32 

3.79 

4.56 

4.55 

2.26 

10 

-60 

5.24 

3.80 

4.59 

4.52 

4.37 

b7 

60 

5.63 

3.90 

5.45 

4.63 

4.76 

0.00 

7 

8 

-60 

5.40 

3.77 

5.62 

4.62 

4.58 

1.75 

9 

b8 

60 

5.67 

3.96 

4.44 

4.81 

0.59 

9 

14 

-60 

5.27 

3.97 

4.32 

4.60 

0.00 

13 

b9 

60 

5.61 

3.77 

4.43 

4.69 

1.72 

8 

10 

60 

5.63 

3.70 

4.42 

4.66 

0.00 

-60 

5.24 

3.80 

4.58 

4.52 

3.50 

10 

-60 

5.28 

3.73 

4.60 

4.51 

3.64 

blO 

60 

5.68 

3.81 

4.60 

4.74 

2.79 

7 

9 

60 

5.67 

3.82 

4.60 

4.74 

3.94 

-60 

5.35 

3.72 

4.50 

4.53 

0.65 

8 

-60 

5.31 

3.70 

4.51 

4.51 

0.00 

bll 

60 

5.58 

3.79 

4.61 

4.68 

3.57 

7 

8 

-60 

5.29 

3.71 

4.41 

4.50 

0.00 

8 

-60 

5.38 

3.68 

4.43 

4.53 

1.06 

ad 

=  distance  from  the  N  to  the  mean  point  of  the  overlapping  negative  charges.  A E 

=  relative  PM3  calculated  energy  differences 

between  the  conformations  defined  by  t2.  A H1(  A H2  -  energy  barriers  around  t1(  t2,  respectively. 
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defines  the  r3  value.  Rotational  barriers  around 
the  CC  bond  associated  with  r3  are  also  given  in 
Table  III. 

In  order  to  learn  about  the  structural  require¬ 
ments  associated  with  the  binding  affinity,  we 
have  superimposed  the  structures  that  result  from 
the  optimization,  constrained  in  the  ru  r2  values, 
with  our  template,  defined  by  bl2.  The  graphical 
superposition  was  preceded  by  the  analytical  com¬ 
parison  of  the  distance  between  the  positive  and 
negative  centers  of  charge,  for  the  conformations 
defined  by  r2  =  60°/  -  60°  (Table  III).  The  distance 
between  the  positive  nitrogen  and  two  of  the  nega¬ 
tive  oxygen  atoms  is  the  same,  within  0.25  A,  for 
the  molecules  under  consideration,  which  is  in¬ 
dicative  of  the  fact  that  the  centers  of  charge  will 


be  easily  overlapped.  If  the  mean  position  between 
the  two  centers  of  negative  charge  is  considered, 
the  calculated  distance  is  in  agreement  with  the 
one  observed  by  X-ray  crystallography  for  the  fu- 
ran,  thienyl,  benzofuran,  and  benzothiophene  ana¬ 
logs  of  baclofen  (4.6  A)  [12,  22-24]  (bl3.  Fig.  1) 
and  with  the  one  calculated  by  molecular  dynam¬ 
ics  for  the  phosphinic  antagonist  CGP55845  [15]. 

Figure  3  shows  the  results  of  the  graphical  su¬ 
perposition,  exemplified  by  the  superposition  of 
bl2  with  bO,  b7,  b8,  and  b9.  The  conformations 
defined  by  r2  =  60°/  -  60°  have  been  compara¬ 
tively  considered.  Graphic  superpositions  for  the 
other  GABAb  analogs  are  available  upon  request. 

In  agreement  with  the  previous  analytical  com¬ 
parison,  the  positive  and  negative  centers  of  charge 


bl  2-bO  [t2»-60] 


b12-b7  [t2"-60] 


FIGURE  3.  Superposition  of  GABAb  analogs  with  the  template,  b12.  Although  the  superposition  has  been  analyzed 
for  all  the  molecules,  4  out  of  1 1  are  given  as  example.  Superposition  has  been  analyzed  for  both  stable  conformations 
defined  by  r2  =  60°/ "  60°. 
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FIGURE  3.  (Continued) 


overlap  for  all  the  molecules  of  the  series.  For  the 
case  of  the  baclofen  analogs,  the  aromatic  sub¬ 
stituent  in  /3  position  overlaps  with  the  oxygen 
atom  of  the  morpholine  ring.  This  superposition 
involves  the  R-configuration  of  the  baclofen 
derivatives,  which  are  known  to  be  the  active 
species  for  receptor  binding.  For  the  morpholine 
acetic  acid  derivatives,  on  the  other  hand,  the 
active  configuration  is  S  on  C2  of  the  morpholine 
ring.  Estereoisomeric  requirements  should  also  be 
considered,  thence,  in  the  definition  of  the  phar- 
macophoric  pattern  of  GABAb  analogs.  Lipophilic 
substitutions  in  /3  position  have  been  extensively 
investigated.  It  has  became  evident  that,  in  addi¬ 
tion  to  the  estereoisomerism,  requirements  related 
to  the  size  of  the  substituent  have  to  be  met.  It  has 
been  found  [15]  that  a  size  over  a  limit,  defined  by 


two  methyl  groups  bonded  to  C5  of  the  morpho¬ 
line  ring  decrease  the  activity  in  the  series  of  the 
morpholine  acetic  acid  derivatives. 

On  the  basis  of  the  previous  analysis  we  are 
able  to  suggest  a  pharmacophoric  pattern  for 
GABAb  analogs  which  is  not  restricted  to  a  unique 
congeneric  family.  It  is  defined  by:  (1)  a  positive 
center  associated  with  an  ammonium  group;  (2)  a 
negative  center  defined  by  two  oxygen  atoms;  (3)  a 
distance  between  them  of  4.6  ( —  +  0.1)  A,  measured 
from  the  nitrogen  atom  to  the  mean  point  of  the 
two  overlapping  oxygens;  (4)  a  straight  conforma¬ 
tion  in  the  positive  end;  (5)  a  torsional  angle  r2 
restricted  to  values  between  60°  and  —60°;  and 
(6)  a  configuration  in  the  /3  position  of  the  GABA 
chain  superimposable  on  the  R-enantiomer  of  the 
baclofen  analogs.  The  characteristics  of  this  pattern 
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b12-b9  [t2*40] 

FIGURE  3.  (Continued) 

capable  of  accessing  the  receptor  site.  The  question 
remains  pen  and  would  probably  need  the  design, 
synthesis,  and  biological  evaluation  of,  at  least,  a 
new  compound,  rigid ified  in  the  negative  end. 

CONFORMATIONS  OF  THU  GAIU  MOLECULE 
IN  THE  GABAa  AND  GABAb  RECEPTOR  SITES 

Table  IV  and  Figure  4  show  the  differences  in 
the  conformations  of  the  GABA  molecule  when 
interacting  with  either  GABAa  or  GABAb  receptor 
sites,  iq  and  r2  define  important  differences  be¬ 
tween  both  conformations.  Whereas  the  value  of  r1 
implies  opposite  orientations  of  the  positive  end,  it 
should  be  remarked  that  the  value  r2  =  180°,  which 
corresponds  to  minimum  energy  for  the  GABAa 
agonists,  belongs  to  a  point  of  maximum  energy  in 
the  conformational  space  of  the  GABAb  analogs. 


indicate  that  r2  becomes  relevant  for  its  definition. 
Although  60°  and  -  60°  appear,  at  first  glance,  as 
equivalent,  they  do  not  define  equivalent  confor¬ 
mations  but  mirror  images,  and  only  one,  defined 
by  a  value  of  t2  between  these  limits,  will  be 


TABLE  IV _ 

Torsional  angles  for  the  GABA  molecule  at  the 
GABAa  and  GABAb  receptor  sites.3 


Ti 

T2 

t3 

t4 

d  (A) 

GABA-A 

15 

180 

180 

0 

>  5.30 

GABA-B 

170 

60 

150 

-30 

4.60 

170 

-60 

-150 

30 

ad  =  distance  from  the  positive  to  the  negative  end  that 
defines  the  pharmacophore. 
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a) 


GABAa  GABAb 

FIGURE  4.  Superposition  of  the  GABA  molecule  in  the 
conformations  associated  with  the  GABAa  and  GABAB 
receptor  sites,  (a)  GABAb  conformation  for  r2  =  60°. 

(b)  GABAb  conformation  for  r2  =  -60°. 


Conclusions 

A  conformational  study,  followed  by  the  identi¬ 
fication  of  similar  fragments  taking  into  account 
the  flexibility  of  the  molecules,  has  led  us  to  sug¬ 
gest  a  pharmacophoric  pattern  for  GABAb  binding 
affinity,  revealing  the  following  features:  (1)  a  pos¬ 
itive  center  associated  with  an  ammonium  group; 

(2)  a  negative  center  defined  by  two  oxygen  atoms; 

(3)  a  distance  between  them  of  4.6  A,  measured 
from  the  nitrogen  atom  to  the  mean  point  of  the 
overlapping  negative  oxygens;  (4)  a  straight  con¬ 
formation  in  the  positive  end;  (5)  a  torsional  angle 
r 2  restricted  to  values  between  60°  and  —  60°;  and 
(6)  a  configuration  in  the  /3  position  of  the  GABA 


chain  superimposable  on  the  R-enantiomer  of  the 
baclofen  analogs. 

The  main  difficulty  of  this  research  was  associ¬ 
ated  with  the  flexibility  of  the  structures  and  the 
lack  of  fully  rigid  GABAb  analogs.  This  fact  leaves 
a  question  open  about  the  accuracy  of  the  defini¬ 
tion  of  the  conformational  characteristics  of  the 
negative  end.  However,  we  have  approached  the 
pharmcophore  closely  enough  to  be  able  to  discern 
the  differences  in  the  conformation  of  the  GABA 
molecule  at  the  GABAa  and  GABAb  receptor  sites. 

Present  research  is  oriented  to  the  design  and 
synthesis  of  completely  rigid  analogs  and  to  the 
homology  modeling  of  the  receptor  on  the  basis  of 
the  comparison  of  the  amino  acid  sequences  of 
GABA  and  glutamate  metabotropic  receptors. 
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ABSTRACT:  We  report  on  optimal  molecular  connectivity  descriptors  for  nitrogen 
atoms  in  amines  for  use  in  structure-property  correlations.  The  descriptors  represent 
generalized  molecular  connectivity  indices  with  adjusted  diagonal  entries  in  the  adjacency 
matrices  of  the  corresponding  molecular  graphs,  such  that  the  standard  error  in  a 
regression  for  boiling  points  in  a  set  of  amines  is  minimized.  Advantages  of  the 
so-optimized  descriptors  for  multivariate  regression  analysis  in  structure-property- 
activity  studies  are  discussed.  ©  1998  John  Wiley  &  Sons,  Inc.  Int  J  Quant  Chem  70:  1209-1215, 
1998 


Introduction 

One  of  the  critical  initial  steps  in  modeling 
structure-property  and  structure -activity 
relationships  is  the  selection  of  molecular  descrip¬ 
tors  to  be  used  in  such  models.  In  the  earlier 
development  of  QSAR,  the  quantitative  struc¬ 
ture-activity  relationship  studies  [1],  the  selected 
molecular  properties  have  been  often  used  as 
molecular  descriptors.  While  this  is  quite  legiti¬ 
mate,  such  an  approach  has  been  characterized  as 
structure-cryptic  [2],  because  it  expresses  biologi- 
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cal  activities  in  terms  of  molecular  properties,  sim¬ 
pler  and  presumably  better  understood;  neverthe¬ 
less,  such  an  approach  does  not  offer  direct  insight 
on  the  structure-property  relationship.  The  suc¬ 
cess  of  such  approach  reflects  the  situation  that 
the,  although  yet  unknown,  same  structural  factors 
may  play  the  critical  role  in  different  molecular 
properties  [3]. 

Chemical  graph  theory  [4]  advocates  an  alterna¬ 
tive  approach  to  QSAR  and  to  the  structure-prop¬ 
erty-activity  relationship  studies  based  on  math¬ 
ematically  derived  molecular  descriptors.  Such 
descriptors,  often  referred  to  as  topological  indices, 
include  the  well-known  Wiener  index  W  [5],  the 
Hosoya  index  Z  [6],  and  the  connectivity  index  x 
[7],  the  latter  one  being  the  most  widely  used  [8]. 
The  Wiener  index  counts  the  number  of  carbon 
atoms  on  each  side  of  bond  in  a  molecule,  the  Z 
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index  counts  nonadjacent  bonds  in  the  carbon 
skeleton  of  a  molecule,  while  *  is  a  bond  additive 
quantity  in  which  bonds  of  different  types  (involv¬ 
ing  primary,  secondary,  tertiary  carbons)  are  given 
different  weights.  As  initially  introduced,  all  three 
indices  (and  the  same  is  true  for  many,  but  by  no 
means  all,  topological  indices)  were  defined  for 
carbon  molecular  skeletons.  This  leaves  the  prob¬ 
lem  of  their  generalization  to  heteroatomic 
molecules  open.  Kier  and  Hall  [9]  recognized  the 
importance  of  extending  the  definition  of  the  con¬ 
nectivity  index  to  heteroatoms  and  proposed  the 
so-called  valence  connectivity  indices  for  which 
the  rules  are  given  as  to  how  to  represent  het¬ 
eroatoms.  The  concept  of  the  valence  connectivity 
indices  also  extends  to  "higher"  connectivity  in¬ 
dices  [10].  Kupchik  considered  an  alternative  route 
to  generalized  connectivity  indices  by  modifying 
empirically  the  difference  in  bond  length  between 
the  heteroatom-carbon  bond  and  the  carbon- 
carbon  bond  [11].  The  modifications  were  based  on 
the  differences  in  the  covalent  radius  between  the 
heteroatom  and  the  carbon  atoms.  The  correspond¬ 
ing  extensions  of  W  and  Z  for  heteroatoms  have 
not  been  yet  considered,  but,  recently,  weighted 
paths  for  heteroatoms  were  considered  [12]. 


Representation  of  Heteroatoms 

It  is  apparent  that  in  order  to  cover  the  in¬ 
creased  variation  in  the  structural  features  that  the 
heteroatom  introduces  molecules  involving  het¬ 
eroatoms  require  additional  descriptors.  The  ap¬ 
proach  of  Kier  and  Hall  [8]  implies  that  het¬ 
eroatoms  differently  weigh  bonds  of  different 
kinds.  The  characterization  of  heteroatoms  by 
graph-theoretical  rather  than  physicochemical 
schemes  has  advantages,  as  it  is  independent  of 
whether  selected  experimental  data  are  available 
and,  if  available,  whether  they  are  reliable. 

An  alternative  approach  of  modifying  the  con¬ 
nectivity  indices  so  that  they  can  better  character¬ 
ize  the  presence  of  heteroatoms  to  that  of  Kier  and 
Hall  was  recently  outlined  for  chlorine  atoms  in 
clonidine  compounds  [13].  The  approach  may  be 
viewed  as  analogous  to  the  early  modifications  of 
the  Hiickel  molecular  orbitals  method  for  het¬ 
eroatoms  [14],  while  the  approach  of  Kier  and  Hall 
would  be  analogous  to  the  approach  of  Slater  for 
modification  of  simple  atomic  orbitals  used  in  the 


early  quantum  chemical  calculations.  A  way  to 
differentiate  heteroatoms  in  a  Hiickel  matrix,  or 
the  adjacency  matrix  of  a  molecular  graph,  is  to 
modify  the  diagonal  and  the  off-diagonal  ele¬ 
ments.  An  earlier  study  showed  that  modification 
of  off-diagonal  elements  has  produced  a  small 
effect  [15]. 

Changes  in  the  diagonal  elements  of  adjacency 
matrices,  as  has  already  been  seen  on  chlorine 
atoms  in  clonidine-type  compounds,  influences  the 
magnitudes  of  computed  weighted  paths  more 
strongly  [13].  In  Table  I,  we  show  how  the  paths  of 
length  one,  paths  of  length  two,  and  paths  of 
length  three  vary  as  we  change  the  diagonal  entry 
corresponding  to  nitrogen  in  a  graph  of  1- 
aminohexane.  The  same  table  applies  to  other  het¬ 
eroatoms  placed  at  the  end  of  the  chain  of  six 
carbon  atoms,  for  example,  it  equally  applies  to 
1-hexanol.  The  difference  between  1-hexanol  and 
fl-aminohexane  will  be  in  different  corresponding 
values  for  the  parameter  y. 

The  weighted  paths  for  rz-heptylamine  were  ob¬ 
tained  from  the  ALL  PATH  program  [16]  by  re¬ 
placing  the  zero  diagonal  entries  in  the  input  adja¬ 
cency  matrix  with  the  value  of  y  selected.  The 
weight  of  bonds  in  the  calculation  of  the  connectiv¬ 
ity  index  are  defined  as  1/  {m^~n  f  where  m  and  n 
are  the  valences  of  the  incident  atoms  (vertices). 
By  introducing  parameters  x  and  y  to  be  assoc¬ 
iated  with  atoms  of  different  kinds,  here,  car¬ 
bon  and  nitrogen  atoms,  respectively,  the  weight 
of  bond  ( m,n )  changes  from  1  /  yf(m  -  n)  to 
1/  ]/[(ni  +  x)  •  (n  4-  y)]  . 


TABLE  I _ 

Variation  of  path  numbers  1 TT  ,  2  77  ,  and  3  tt  for 
1-aminohexane  as  a  function  of  the  diagonal  entry 
for  the  nitrogen  atoms  (diagonal  entries  for  carbon 
atoms  are  assumed  zero). 


y 

i 

TT 

2 

TT 

3 

TT 

0.5 

3.2845 

1 .3922 

0.5711 

0 

3.4142 

1.4571 

0.6036 

-0.25 

3.5236 

1.5118 

0.6309 

-0.50 

3.7071 

1 .6036 

0.6768 

-0.75 

4.1213 

1.8107 

0.7803 

-0.80 

4.2883 

1 .8941 

0.8221 

-0.85 

4.5326 

2.0164 

0.8832 

-0.90 

4.9432 

2.2216 

0.9858 

-0.95 

5.8694 

2.6847 

1.2175 
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In  Table  I,  we  show  how  the  count  of  the  paths 
changes  as  y  decreases  from  small  positive  values 
and  approaches  the  limiting  value  of  - 1  (assum¬ 
ing  x  =  0).  In  general,  both  x  and  y  will  change. 
For  example,  as  we  will  see  in  the  next  section,  the 
values  x  =  1.25  and  y  =  -0.65  are  found  as  opti¬ 
mal  for  carbon  and  nitrogen  atoms,  respectively, 
when  considering  the  boiling  points  in  amines.  The 
optimal  value  of  x  =  1.25  corresponds  to  a  change 
of  the  valence  for  carbon  atoms  in  a  molecular 
graph  from  their  values  of  1  and  2  to  the  values 
2.25  and  3.25  for  the  terminal  and  the  bridge  car¬ 
bon  atoms,  respectively.  Similarly,  the  formal  va¬ 
lence  of  the  terminal  nitrogen  atom  becomes  0.35 
instead  of  remaining  equal  to  1.  These  changes  in 
the  x  and  y  values  alter  the  relative  role  that  the 
carbon  and  nitrogen  atoms  play.  The  negative  val¬ 
ues  of  y  result  in  a  more  pronounced  role  of  the 
heteroatom.  However,  the  relative  role  of  the 
shorter  and  longer  paths  for  the  carbon  atom  or  the 
nitrogen  atom  have  not  changed. 

In  Figure  1,  we  plotted  2tt  against  V  and  3tt 
against  V  for  the  weighted  path  numbers  given  in 
Table  I.  Although  as  the  diagonal  entry  of  the 
adjacency  matrix  decreases  and  the  magnitudes  of 
the  path  numbers  V,  2i r,  and  3rr  increase,  their 
relative  importance  remains  unchanged  as  is  re¬ 
flected  by  the  linear  correlation  (shown  in  Fig.  2) 
between  them. 


6 


5 


? 


4 


3_1 - 1 - 1 - 1 - 1 - 1 - 1 - 1 - ! 

-1000  -800  -600  -400  -200  0  200  400  600 

y  x  10^3 


3 


1  -| - 1 - , - 1 - 1 - 1 - 1 - 1 - 1 

-1000  -800  -600  -400  -200  0  200  400  600 

y  x  10A-3 

FIGURE  1.  Variation  of  the  V  as  a  function  of  the 
diagonal  entry  for  the  nitrogen  atom  in  the  adjacency 
matrix  of  the  molecular  graph  of  1-aminohexane. 


2,80 

2,60 

2.40 

2,20 

2,00 

1,80 

1,60 

1.40 

1,20 

1,00 

3,00  3,50  4,00  4,50  5,00  5,50  6,00 

FIGURE  2.  Plot  of  the  magnitudes  of  ztt  against  V  for  different  values  of  y  (assuming  x  =  0),  showing  a  linear 
relationship. 
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Search  for  the  Optimal  Descriptors 

The  outlined  approach  illustrates  the  potential 
of  the  modified  descriptors  to  discriminate  atoms 
of  different  kinds.  The  problem  is  to  find  the 
optimal  values  for  x  and  y  to  be  used  in  struc¬ 
ture-activity  studies.  One  way  to  select  the  best 
values  of  x  and  y  is  to  minimize  the  standard 
error  s  in  the  multiple-regression  analyses  (MRA). 
Having  thus  defined  the  procedure  for  discrimina¬ 
tion  among  heteroatoms  by  having  the  diagonal 
entries  in  the  adjacency  matrices  as  variables,  we 
have  to  search  for  the  best  parameters  to  describe 
different  atoms  in  different  situations.  Several  fac¬ 
tors  will  influence  such  a  search: 

1.  The  selection  of  the  compounds. 

2.  The  selection  of  the  property. 

3.  The  selection  of  the  descriptors  to  be  used. 

All  these  factors  (and  possibly  other)  have  to  be 
examined.  Of  particular  interest  is  to  find  how 
sensitive  are  the  so-derived  parameters  on  the 
choice  of  compounds,  on  the  choice  of  properties, 
and  on  the  choice  of  the  descriptors  used  and  how 
they  depend  on  the  number  of  descriptors  used. 
The  early  results  on  alcohols  and  their  boiling 
points  [17]  are  encouraging.  In  the  case  of  alcohols 


and  their  boiling  points,  x  =  1.5  and  y  =  -0.85 
gave  the  best  single-variable  regression.  Here,  x 
and  y  stand  for  the  carbon  and  oxygen  atoms' 
diagonal  entries,  respectively.  The  use  of  the  above 
values  for  x  and  y  reduced  the  standard  error  s 
by  more  than  one-half  when  a  comparison  was 
made  with  the  similar  regression  in  which  the 
carbon  and  oxygen  atoms  were  not  discriminated. 

In  this  article,  we  report  the  corresponding  anal¬ 
ysis  for  the  regression  of  the  boiling  points  in 
nitrogen-containing  amines.  The  result  is  of  inter¬ 
est  possibly  in  the  future  search  for  optimal  de¬ 
scriptors  for  nitrogen  atoms  when  considering 
other  molecular  properties  and  when  considering 
other  nitrogen-containing  compounds. 


Regression  of  Boiling  Points 
for  Amines 

In  Table  II,  we  list  16  primary  amines,  their 
experimental  boiling  points  (as  reported  in  [8]),  the 
generalized  connectivity  indices  based  on  the  val¬ 
ues  of  x  =  1.25  and  y  =  -0.65,  the  calculated 
boiling  points,  and  the  difference  between  the  ob¬ 
served  and  the  calculated  values.  If  we  do  not 
differentiate  between  the  carbons  and  the  nitro¬ 
gens  (i.e.,  when  x  =  y  =  0),  the  regression  of  the 
boiling  points  against  the  connectivity  indices  has 


TABLE  II  _ _ _ _ 

The  weighted  paths  numbers  Ott),  experimental  boiling  points  (Bpexp),  calculated  boiling  points  (Bp  calcd), 
and  the  difference  between  the  experimental  and  calculated  values  (Diff.)  for  a  number  of  primary  amines, 
with  the  values  assumed  for  x  and  y  corresponding  to  the  optimal  values  x  =  1 .25  and  y  =  -  0.65. 


Molecule 

i 

7 T 

^Pexp 

^Pealed 

Diff. 

1-Aminononane 

3.46126 

201 

204.9 

-3.9 

1-Aminooctane 

3.15357 

180 

179.2 

+0.8 

1-Aminoheptane 

2.84588 

155 

153.5 

+  1.5 

1-Aminohexane 

2.53818 

130 

127.8 

+2.2 

1  -Amino-4-methylpentane 

2.46883 

125 

122.0 

+2.9 

2-Aminohexane 

2.39755 

114.5 

116.1 

-1.6 

1-Aminopentane 

2.23049 

104 

102.2 

+  1.8 

1  -Amino-2-methylbutane 

2.16893 

96 

97.0 

-1.0 

1  -Amino-3-methylbutane 

2.16114 

96 

96.4 

-0.4 

2-Aminopentane 

2.08986 

92 

90.4 

+  1.6 

3-Aminopentane 

2.09766 

91 

91.1 

-0.1 

2-Amino-2-methylbutane 

1.93152 

78 

77.2 

+0.8 

1-Aminobutane 

1 .92280 

77 

76.5 

+0.5 

1  -Amino-2-methylpropane 

1 .85344 

69 

70.7 

-1.7 

2-Aminobutane 

1.78217 

63 

64.7 

-1.7 

1-Aminopropane 

1.61511 

49 

50.8 

-1.8 
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for  the  standard  error  s  =  3.49°C.  It  is  desirable 
from  a  practitions'  point  of  view  to  aim  at  the 
standard  errors  in  the  boiling  points  below  1°C,  if 
possible.  The  standard  error  of  almost  3.5°C  is 
clearly  unsatisfactory.  It  is  not  surprising  that  the 
simple  connectivity  index  (x  =  0,  y  =  0)  cannot 
offer  good  results  for  alcohol  boiling  points.  For 
example,  the  boiling  points  for  2-methylpentamine 
(114.5°C)  and  4-methylpentamine  (125°C)  differ  by 
more  than  10°C,  yet  the  two  molecules  have  the 
same  molecular  graph  (when  the  carbon  atom  and 
the  nitrogens  are  not  discriminated). 

If  we  vary  x,  the  variable  describing  the  carbon 
atoms,  we  can  reduce  the  standard  error  s  some¬ 
what.  When  x  =  1.25  (see  Table  III),  s  is  reduced 
to  3.01°C.  However,  by  changing  the  diagonal  pa¬ 
rameter  for  nitrogen,  we  achieve  a  dramatic  im¬ 
provement  in  the  reduction  of  the  standard  error. 
When  y  =  -  0.65°C  (while  x  —  0),  the  standard 
error  s  becomes  2.08°C  To  find  the  optimal  values 
for  these  parameters  and  to  locate  the  minimum  in 
s  by  changing  both,  x  and  y,  we  screened  some  20 
points  in  the  x,  y  space.  As  we  see  from  Table  III, 
the  minimal  standard  error,  close  to  1.90°C,  is 
obtained  when  x  =  1.25  and  y  =  -0.65.  The  re¬ 
gression  equation  corresponding  to  the  so  optimal 
parameters  of  x  and  y  is 

BPcak  =  83.456  V  -  83.992 
s  =  1.91  r  =  0.9990  F  =  7298 

Here,  r  is  the  regression  coefficient,  F  is  Fisher 
ratio,  and  V  stands  for  the  weighted  path  of 
length  1  (i.e.,  the  modified  connectivity  index  us¬ 
ing  x  =  1.25  and  y  =  —0.65).  Figure  3  shows  a 


plot  of  calculated  boiling  points  against  the  experi¬ 
mental  values. 


Discussion 

The  first  thing  to  observe  is  that  the  variable 
graph  descriptors  (with  optimal  values  of  x  and 
y)  have  reduced  the  standard  error  s  in  the  regres¬ 
sion  of  the  boiling  points  in  primary  alkylamines 
by  almost  one-half  when  compared  with  the  re¬ 
gression  based  on  simple  molecular  graphs.  Typi¬ 
cally,  in  multivariate  regression  analysis,  a  reduc¬ 
tion  of  s  by  half  is  not  easy  to  achieve,  and  when 
reported,  often  it  is  achieved  by  introducing  one  or 
more  additional  molecular  descriptors.  In  contrast, 
we  obtained  improved  regression  still  using  a  sin¬ 
gle  molecular  descriptor.  The  disadvantage  of  us¬ 
ing  two  or  more  descriptors  over  a  single  descrip¬ 
tor  is  in  the  difficulties  in  the  interpretation  of  the 
results.  Generally,  molecular  descriptors  (topologi¬ 
cal  indices)  are  interrelated,  often  strongly  interre¬ 
lated.  Due  to  the  interrelatedness  of  the  descrip¬ 
tors,  it  is  not  possible  to  identify  the  separate  roles 
that  individual  descriptors  play.  Even  though,  re¬ 
cently,  the  question  of  interrelatedness  of  descrip¬ 
tors  has  finally  been  successfully  resolved  [18], 
nevertheless,  it  is  easier  to  interpret  correlations 
based  on  a  single  descriptor. 

Introduction  of  orthogonal  descriptors  [18]  re¬ 
quires  that  one  order  the  descriptors,  that  is,  prior¬ 
ities  the  variables.  This  can  sometimes  be  accom¬ 
plished  naturally  (like  in  the  case  of  ordering  paths 
according  to  their  length),  but  sometimes  the  or¬ 
dering  of  the  descriptors  is  not  apparent.  Hence, 


TABLE  III _ 

Variation  of  the  standard  error  s  as  a  function  of  x  and  y  when  weighted  paths  ^  are  used  as  single  descriptor 
in  the  regression  analysis,  x  and  y  represent  parameters  for  carbon  and  nitrogen  atoms,  respectively. 


y 


-  0.80 

-0.70 

-0.65 

-0.60 

-0.50 

0 

x  =  0 
x  =  1 .00 

2.6031 

1 .9632 

2.0812 

1.9131 

1 .9471 

2.1213 

3.4876 

x  =  1 .25 

2.5366 

1 .9463 

1 .9069 

1 .9506 

2.1336 

3.0082 

X  =  1 .50 

2.4697 

1.9293 

1.9137 

1 .9674 

2.1615 

x  =  1 .75 
x  =  2.00 

1 .9249 

1 .9270 

1.9511 
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FIGURE  3.  Calculated  boiling  points  against  the  experimental  boiling  points  for  the  amines  examined. 


use  of  a  single  descriptor  clearly  has  advantages  in 
this  respect.  An  optimal  descriptor  offers  a  direct 
structural  interpretation  for  the  property  in  terms 
of  the  dominant  variable.  In  addition,  single  de¬ 
scriptors,  or  a  set  of  structurally  related  descriptors 
[19],  facilitates  comparative  studies.  Recently,  in 
an  extensive  comparative  study  of  the  properties 
of  octanes  [20],  it  was  found  that  only  a  few 
molecular  descriptors  emerge  as  the  best  when 
compared  to  alternatives.  The  molecular  connectiv¬ 
ity  indices  have  been  found  often  among  those 
that  give  the  best  regressions.  With  the  outlined 
procedure,  we  have  initiated  an  important  direc¬ 
tion  in  the  search  for  best  molecular  descriptors  for 
structure-property  studies.  We  may  expect  not 
only  improved  regression  analyses  but,  hopefully, 
better  insights  into  the  dependence  of  molecular 
properties  on  the  shape,  the  size,  and  the  function¬ 
ality  of  molecule  forms. 
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